Research Article  Open Access
Novel Model for Cascading Failure Based on Degree Strength and Its Application in Directed Gene Logic Networks
Abstract
A novel model for cascading failures in a directed logic network, based on the degree strength of a node, is proposed. The definitions of the in-degree and out-degree strength of a node are first reconsidered, and the load at a nonisolated node is defined as the ratio of its in-degree strength to its out-degree strength. The cascading failure model based on degree strength is applied to the logic networks of three types of cancer, namely, adenocarcinoma of lung, prostate cancer, and colon cancer, built from their gene expression profiles. To highlight the differences between the three networks under the cascading failure mechanism, we use the largest-scale cascades and the cumulative cascade probability to depict the damage. We find that the cascading failures caused by hubs are usually larger. Furthermore, the results show that propagation through the networks is correlated with the structural motifs of connected logical doublets. Finally, some genes are selected based on the cascading failure mechanism. We believe that these genes may be involved in the occurrence and development of the three types of cancer.
1. Introduction
Over the past few decades, many scientists have studied cascading failures in different networks, such as electrical power networks [1–3], traffic networks [4, 5], Internet networks [6], social networks [7, 8], and even biological networks [9, 10]. Various models of cascading failures, their mechanisms, and their prevention have been proposed. For instance, Motter and Lai [11] proposed a load-capacity cascading failure model and simulated it on scale-free networks with an arbitrary power-law exponent. The results showed that loads are redistributed among the nodes and that intentional attacks can lead to a cascade of overload failures, which can cause the entire network or a large part of it to collapse. Wang and Xu [12] investigated cascading failures in coupled map lattices with different topologies and found that cascading failures occur much more easily in small-world and scale-free coupled map lattices than in globally coupled map lattices. Crucitti et al. [13] presented a simple model for cascading failures based on the dynamical redistribution of the flow in the network, showing that the breakdown of a single node is sufficient to reduce the efficiency of the entire system if the node is among those with the largest load.
Recently, some researchers have focused on cascading failure mechanisms in directed networks. Fang et al. [14] proposed a cascading failure model for directed complex networks. They compared two attack strategies, the minimum in-degree attack and the maximum out-degree attack, with a random attack strategy through simulations. Their numerical results show that cascading failure propagation in directed complex networks is highly dependent on the attack strategy and the directionality of the network. Jin et al. [15] built a load-capacity cascading failure model for directed and weighted networks. They applied the model to two typical real networks, namely, a Poisson distribution network and a power-law distribution network. Through simulation analyses, they concluded that the average weight and the average in-degree should be increased, respectively, to enhance the resistance to overloading and short-loading failures. Smart et al. [9] investigated the relationship between structure and robustness in the metabolic networks of Staphylococcus aureus and other organisms using a cascading failure model based on a topological flux balance criterion.
Despite this progress, few studies have attempted to identify the cascading failure mechanism in a directed gene logic network. In this study, we investigate a load-capacity cascading failure model based on the degree strength of nodes and identify the influence of cascading failures on gene logic networks. First, the directed network is constructed, and the definitions of in-degree, out-degree, and degree strength are refined for the different regulation types of second-order logical relationships. Then a novel algorithm for cascading failure based on the load-capacity model is investigated. The load at a node is defined as the ratio of the in-degree strength to the out-degree strength of the node. The capacity of a node is the interval from the minimum load to the maximum load that the node can handle. A particular gene node is removed initially, and the number of cascading failure nodes that results is recorded. This process is repeated for each gene node in the network. Two parameters, the probability that removing a gene node yields damage greater than or equal to a given size and the largest size ratio of cascading failure, are used to detect the relationship between network structure and robustness. Applying the model to gene expression profile data for adenocarcinoma of lung, prostate cancer, and colon cancer, we find that hubs connected with other nodes by logical motifs are more likely to break down. The study of cascading failure in gene networks may provide useful information on the biological mechanisms underlying the formation and development of cancers.
2. Methods
2.1. In-Degree and Out-Degree in the Logic Network
Bowers et al. [16, 17] proposed the logic analysis of phylogenetic profiles (LAPP) and demonstrated the benefits of identifying the relationships among gene triplets, which are more likely to reveal the network organization of gene interactions; these interactions form the gene logic network. The network can be considered a weighted and directed graph that captures different logic interactions among gene nodes, including first-order and second-order logical relationships, identified by the uncertainty coefficient at given thresholds (for details about the gene logic network, see Wang et al. [18] and Zhang et al. [19]).
In a first-order logical relationship, taking $a \to b$ as an example, the uncertainty coefficient is defined as
$$U(b \mid a) = \frac{H(a) + H(b) - H(a, b)}{H(b)},$$
which measures the probability that gene $a$ regulates gene $b$, where $H(a)$ and $H(b)$ are the Shannon entropies of the vectors $a$ and $b$, respectively, and $H(a, b)$ is the joint entropy of $a$ and $b$. This regulatory relationship is denoted as a weighted and directed edge. Figure 1 gives three topologies for the in-degree and out-degree of a node under first-order logical relationships. Clearly, both the in-degree of $b$ and the out-degree of $a$ are increased by one for $a \to b$. A second-order logical relationship as shown in Figure 2, for example, $c = f(a, b)$, has an uncertainty coefficient denoted as $U(c \mid f(a, b))$ that measures the probability that this second-order logical relationship exists. In this formula, $f$ is the logical function. The uncertainty coefficient of $c = f(a, b)$ can be calculated by
$$U(c \mid f(a, b)) = \frac{H(c) + H(f(a, b)) - H(c, f(a, b))}{H(c)}.$$
The second-order logic relationship can be considered to be a directed edge with the weight $U(c \mid f(a, b))$. As in the LAPP method, all such gene triplets, with the corresponding $U$ values, give rise to the gene logic networks further studied in our present work.
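As an illustration, both uncertainty coefficients can be computed directly from binary expression vectors. The following is a minimal Python sketch that assumes the standard LAPP form of the coefficient, $U(x \mid y) = [H(x) + H(y) - H(x, y)] / H(x)$; the function names `entropy` and `uncertainty_coefficient` and the toy vectors are hypothetical and are not taken from the original work.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of hashable symbols."""
    n = len(values)
    return -sum((k / n) * log2(k / n) for k in Counter(values).values())


def uncertainty_coefficient(target, predictor):
    """U(target | predictor) = [H(target) + H(predictor) - H(joint)] / H(target)."""
    h_target = entropy(target)
    if h_target == 0:
        # Assumption: a constant target has nothing to be explained.
        return 0.0
    h_joint = entropy(list(zip(target, predictor)))
    return (h_target + entropy(predictor) - h_joint) / h_target


# Toy binary expression vectors (hypothetical data).
a = [1, 1, 0, 0, 1, 0, 1, 0]
b = [1, 0, 1, 0, 1, 0, 0, 1]
c = [x & y for x, y in zip(a, b)]  # c is constructed as a AND b

# First-order: how well a alone explains c.
u_first = uncertainty_coefficient(c, a)

# Second-order: how well the logic function f(a, b) = a AND b explains c.
f_ab = [x & y for x, y in zip(a, b)]
u_second = uncertainty_coefficient(c, f_ab)
```

In this toy case the second-order coefficient is larger than the first-order one, which is the situation in which a triplet would be reported as a second-order relationship.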
The definitions of the in-degree and out-degree need to be refined for the different regulation types of second-order logical relationships, namely, AND, OR, and XOR. We propose these new definitions based on two principles: first, the sum of the in-degrees and the sum of the out-degrees over all nodes in a network are equal; second, the definitions must be consistent with those of the degree and strength in first-order logical relationships.
Based on these two principles, the in-degree and out-degree of second-order logical relationships are defined as follows. Consider a second-order relationship in which genes $a$ and $b$ regulate gene $c$. The in-degree of the target $c$ is increased by this relationship, whereas the out-degrees of $a$ and $b$ are determined by the proportions of their contributions to the second-order logical relationship. These proportions can be calculated from the gene expression data used to build the gene networks in this research. Moreover, the second principle is meaningful only for the OR logic, as $a$ and $b$ regulate $c$ simultaneously under the AND logic.
For the XOR logic, we cannot determine how $a$ and $b$ regulate $c$ (cooperatively or independently) merely from their gene expression profiles. For the OR logic, the proportion of the contribution from $a$ to that from $b$ is calculated as follows. In a gene expression profile, the components “1” and “0” denote the presence and the absence of the gene, respectively. An $n \times 3$ matrix with elements 1 or 0 denotes the gene expression profiles of genes $a$, $b$, and $c$ arranged in columns, where $n$ is the dimension of these vectors. Each row of the matrix is a three-dimensional vector, and each column is an $n$-dimensional vector. Three row frequencies are counted: the frequency of row $(1, 1, 1)$, which indicates that both $a$ and $b$ activate $c$; the frequency of row $(1, 0, 1)$, which indicates that only $a$ activates $c$; and the frequency of row $(0, 1, 1)$, which indicates that only $b$ conducts the activation. The out-degree added to $a$ by this second-order logic is its proportion of the contribution times the total out-degree (i.e., the in-degree increment of $c$), and the out-degree distributed to $b$ is its proportion times the total out-degree. Specifically, for the OR logic, the out-degree increments of $a$ and $b$ are set according to the gene expression profiles of the nodes in the network. For the AND logic and for the XOR logic, $a$ and $b$ each receive the same out-degree increment; for the XOR logic, the in-degree increment of $c$ is 1. Table 1 lists the different types of logic relationship as well as their corresponding in-degrees and out-degrees.
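The row-frequency counting described above can be sketched in a few lines of Python. Counting the rows $(1, 1, 1)$, $(1, 0, 1)$, and $(0, 1, 1)$ follows the text; the rule used in `split_out_degree` to turn the counts into proportions (jointly activated rows are shared equally between the two regulators) is only an assumption made for illustration, and all function names and the toy profiles are hypothetical.

```python
from collections import Counter


def or_row_frequencies(a, b, c):
    """Count rows (1,1,1), (1,0,1) and (0,1,1) of the n x 3 expression matrix."""
    rows = Counter(zip(a, b, c))
    return rows[(1, 1, 1)], rows[(1, 0, 1)], rows[(0, 1, 1)]


def split_out_degree(both, only_a, only_b, total=1.0):
    """Split the total out-degree between the two regulators.

    ASSUMPTION (not from the source): rows where both regulators are
    active are credited half to each regulator.
    """
    weight_a = only_a + both / 2
    weight_b = only_b + both / 2
    denom = weight_a + weight_b
    if denom == 0:
        return total / 2, total / 2
    return total * weight_a / denom, total * weight_b / denom


# Toy profiles (hypothetical): c equals a OR b.
a = [1, 1, 1, 0, 0, 1]
b = [1, 0, 0, 1, 0, 1]
c = [x | y for x, y in zip(a, b)]

counts = or_row_frequencies(a, b, c)
out_a, out_b = split_out_degree(*counts)
```

Whatever splitting rule is used, the two out-degree increments should add up to the in-degree increment of the target, which is the first of the two principles stated above.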

2.2. Model of Cascading Failure for the Logic Network
Definition 1 (in-degree strength and out-degree strength of a node). Suppose that there are $m$ nodes regulating node $i$ only by first-order logical relationships. The in-degree strength of node $i$ is then defined as $s_i^{\mathrm{in}} = \sum_{j=1}^{m} U(i \mid j)$, where $U(i \mid j)$ denotes the uncertainty coefficient of gene node $j$ controlling gene node $i$. Conversely, if node $i$ is the source gene node regulating other nodes by first-order logical relationships, then the out-degree strength of node $i$ is defined as $s_i^{\mathrm{out}} = \sum_{j} U(j \mid i)$. Considering logical triplets, if node $i$ is the target node of two other nodes only by second-order logical relationships, then the in-degree strength of node $i$ is defined from the uncertainty coefficients of those second-order relationships.
Conversely, if node $i$ and other nodes jointly regulate a target node only by second-order logical relationships, then the out-degree strength of node $i$ is defined from the corresponding uncertainty coefficients together with a term that corresponds to the types of second-order logic shown in Table 1. Finally, the total in-degree strength (out-degree strength) of node $i$ is the sum of all in-degree strengths (out-degree strengths) of node $i$ arising from both first-order and second-order logical relationships.
Definition 2 (load at a node). For a nonisolated node $i$, its load $L_i$ is defined in terms of its local information as the ratio of the in-degree strength to the out-degree strength. Specifically, if the in-degree strength of node $i$ is zero and its out-degree strength is positive, then its load is $L_i = 0$. If the in-degree strength of node $i$ is positive and its out-degree strength is zero, then its load is $L_i = +\infty$. If both the in-degree strength $s_i^{\mathrm{in}}$ and the out-degree strength $s_i^{\mathrm{out}}$ of node $i$ are positive, then its load is $L_i = s_i^{\mathrm{in}} / s_i^{\mathrm{out}}$.
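The three cases of Definition 2 translate into a small function. This is a sketch only; the name `node_load` is hypothetical, and the choice to raise an error for an isolated node reflects the fact that the definition covers nonisolated nodes only.

```python
import math


def node_load(in_strength, out_strength):
    """Load of a nonisolated node: in-degree strength over out-degree strength."""
    if in_strength == 0 and out_strength == 0:
        # The definition does not cover isolated nodes.
        raise ValueError("load is defined only for nonisolated nodes")
    if in_strength == 0:
        return 0.0        # pure source node
    if out_strength == 0:
        return math.inf   # pure target node
    return in_strength / out_strength


source_load = node_load(0.0, 0.8)
sink_load = node_load(0.5, 0.0)
relay_load = node_load(0.6, 0.4)
```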
Definition 3 (capacity of a node). Two capacities in node are defined: for node , the lower limit of capacity is and the upper limit of capacity is , where parameter . Three cases are presented as follows: if , then , and the interval shrinks to a point. If , then , and we consider that the interval becomes . If , it forms a real interval from the minimum load to the maximum load which the node can handle.
When all the nodes are active, the network operates in a free-flow state [11]. However, the removal of a node may cause the loads of other nodes to be redistributed. The redistribution may cause the load of a node to increase or decrease beyond its initial capacity interval. In particular, the load may decrease from a positive value to 0 or increase to $+\infty$. The corresponding nodes then collapse, and subsequent failures follow. Although the cascade may stop after a few steps, it may also propagate and shut down a considerable fraction of the whole network. We study the degree-strength-based cascading failure model (DSCFM), its mechanism, and the relationship between structure and stability in order to control cascading failures in the gene logic network.
Let $G = (V, E, W)$ be a logic network with the gene node set $V$, the directed edge set $E$, and the edge weight set $W$. Suppose that the logic network has no multiple edges or self-loops. On the basis of the above definitions and symbols, we propose the following algorithm.
Input. Initial matrix of the logic network.
Step 1. Select an initial node, and then calculate the load and the lower and upper capacity limits.
Step 2. Delete the selected node and its incident edges (both incoming and outgoing edges).
Step 3. Calculate the current load of each remaining node and compare it with the node's initial capacity. Then delete every node that fails, along with its remaining edges.
Step 4. Repeat Step 3 until no further failure occurs.
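The steps above can be sketched as a short simulation. The sketch rests on several assumptions that are not stated in the text: only first-order weighted edges are modelled, the capacity interval of a node with initial load $L$ is taken to be $[(1 - \alpha)L, (1 + \alpha)L]$ with $0 < \alpha < 1$, failures are evaluated synchronously in each round, and a node that loses all of its edges is counted as failed. The names `strengths`, `load`, and `cascade_size` and the toy network are hypothetical.

```python
import math


def strengths(edges, nodes):
    """In- and out-degree strengths restricted to the surviving nodes."""
    s_in = {v: 0.0 for v in nodes}
    s_out = {v: 0.0 for v in nodes}
    for (u, v), w in edges.items():
        if u in nodes and v in nodes:
            s_out[u] += w
            s_in[v] += w
    return s_in, s_out


def load(s_in, s_out):
    """Load as in-strength over out-strength; None marks an isolated node."""
    if s_in == 0 and s_out == 0:
        return None
    if s_out == 0:
        return math.inf
    return s_in / s_out


def cascade_size(edges, seed, alpha):
    """Number of failed nodes (seed included) after removing `seed`."""
    nodes = {n for edge in edges for n in edge}
    s_in, s_out = strengths(edges, nodes)
    initial = {v: load(s_in[v], s_out[v]) for v in nodes}   # Step 1
    alive, failed = nodes - {seed}, {seed}                  # Step 2
    while True:                                             # Steps 3 and 4
        s_in, s_out = strengths(edges, alive)
        newly = set()
        for v in alive:
            current = load(s_in[v], s_out[v])
            # ASSUMED capacity interval around the initial load.
            low, high = (1 - alpha) * initial[v], (1 + alpha) * initial[v]
            # ASSUMPTION: a node that became isolated counts as failed.
            if current is None or not (low <= current <= high):
                newly.add(v)
        if not newly:
            return len(failed)
        alive -= newly
        failed |= newly


# Toy network (hypothetical weights): p and q regulate v, v regulates r.
edges = {("p", "v"): 0.5, ("q", "v"): 0.5, ("v", "r"): 1.0}
tolerant = cascade_size(edges, "p", alpha=0.5)   # v's load 0.5 stays in [0.5, 1.5]
fragile = cascade_size(edges, "p", alpha=0.3)    # v's load 0.5 leaves [0.7, 1.3]
```

In this toy network the same initial removal is contained when the capacity parameter is 0.5 but brings down every node when it is 0.3, which illustrates how a narrower capacity interval makes the network easier to break.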
3. Results
The real gene expression data were all downloaded from the Gene Expression Omnibus (GEO). All data sets were based on the GeneChip Human Genome U133A array. The lung normal group was recorded as control group I and the lung adenocarcinoma group as experimental group I. The prostate normal group was recorded as control group II and the prostate cancer group as experimental group II. Similarly, the colon normal group was recorded as control group III and the colon cancer group as experimental group III. The details are shown in Table 2. Furthermore, by using the Expression Console software provided by Affymetrix, we obtain the $P$ value, $A$ value, and corresponding $M$ values, where $P$ represents Present (expressed), $A$ represents Absent (not expressed), and $M$ represents Marginal. The $P$ value in the database is recorded as 1, and the values of $A$ and $M$ are both recorded as 0.
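The recoding of detection calls into a binary profile is simple enough to show directly. The sketch below assumes the calls are available as the single letters P, A, and M; the names `CALL_TO_BIT` and `binarize_calls` and the sample calls are hypothetical.

```python
# Present is recorded as 1; Absent and Marginal are both recorded as 0.
CALL_TO_BIT = {"P": 1, "A": 0, "M": 0}


def binarize_calls(calls):
    """Turn a list of detection calls into a 0/1 expression profile."""
    try:
        return [CALL_TO_BIT[call] for call in calls]
    except KeyError as exc:
        raise ValueError(f"unknown detection call: {exc.args[0]!r}") from None


profile = binarize_calls(["P", "A", "M", "P", "P"])
```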

However, each data set contains few samples and too many genes (more than 20000). We therefore choose the genes that differ significantly between the control group and the experimental group for each of the three types. Candidate genes are selected with the Wilcoxon rank sum test [19] at the chosen significance level using the corresponding $p$ values. Finally, 60, 65, and 79 genes were filtered out from the initial data, and their expression matrices were obtained, where each row represents a gene and contains a binary string of 0’s and 1’s to indicate the presence or the absence of the gene (http://cise.sdust.edu.cn/labs/other/zhangyulin/2017/workingdata.rar).
Two thresholds, namely, the first-order threshold and the second-order threshold, are used to detect the connections among nodes in the gene logic networks. Table 3 gives structural features, including the numbers of nodes and edges, for different pairs of thresholds. The number and distribution of the two orders of logic types in the networks change with the thresholds. The degree of each node changes accordingly, and as a result its in-degree strength and out-degree strength also change. As the thresholds increase, the average degree of the network nodes decreases. We analyze the relationship between robustness against cascading failure and network structural features, such as degree and network motifs, under several thresholds.
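The effect of the two thresholds can be illustrated with a small filter. The sketch assumes that a relationship is kept when its uncertainty coefficient is at least the corresponding threshold; the function name `apply_thresholds`, the dictionary layout, and all numeric values are hypothetical.

```python
def apply_thresholds(first_order, second_order, t1, t2):
    """Keep relationships whose uncertainty coefficient reaches its threshold.

    ASSUMPTION: the comparison is inclusive (coefficient >= threshold).
    """
    kept_first = {edge: u for edge, u in first_order.items() if u >= t1}
    kept_second = {trip: u for trip, u in second_order.items() if u >= t2}
    return kept_first, kept_second


# Hypothetical coefficients: (source, target) for first-order relationships,
# (regulator, regulator, logic type, target) for second-order ones.
first_order = {("g1", "g2"): 0.35, ("g2", "g3"): 0.12, ("g3", "g1"): 0.50}
second_order = {
    ("g1", "g2", "AND", "g4"): 0.62,
    ("g2", "g3", "OR", "g4"): 0.41,
}

loose_first, loose_second = apply_thresholds(first_order, second_order, 0.1, 0.4)
strict_first, strict_second = apply_thresholds(first_order, second_order, 0.3, 0.6)
```

Raising the thresholds removes relationships, so the stricter setting yields a sparser network with a lower average degree.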

A failure cascade triggered by the initial removal of a gene node is characterized by the total number of nodes deleted. After deleting node $i$, let $S_i$ be the number of failed nodes (including node $i$) and $N$ the number of nodes in the network; the ratio $S_i / N$ is an approximate indicator of network damage. The largest size ratio of cascading failure is $\max_i S_i / N$. Let
$$\delta_i(s) = \begin{cases} 1, & S_i / N \ge s, \\ 0, & \text{otherwise}, \end{cases}$$
where $s$ is a variable parameter. Then the cumulative probability of cascading failures is defined as $P(s) = \frac{1}{N} \sum_{i} \delta_i(s)$, denoting the probability that the network’s cascading failures are at least as large as $s$. The structural parameters defined above are used to measure the relationship between the network structure and the robustness of a network when successive failures occur. We focus on the key nodes that cause large-scale cascading failures on the network, that is, the key failure nodes, which are related to parameters such as the first-order threshold, the second-order threshold, and the capacity parameter. Firstly, the capacity parameter plays an important role in maintaining the robustness of the network. Let the capacity parameter take values from 0.1 to 0.9 in increments of 0.1. Figure 3 shows how the largest size ratio of cascading failure changes with the capacity parameter. Clearly, the smaller the capacity parameter is, the more easily the logic network fails in a cascade.
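Both summary quantities can be computed from the list of cascade sizes obtained by removing each node in turn. The sketch below assumes that damage is measured as the ratio of failed nodes to the total number of nodes and that the cumulative probability uses an inclusive comparison; the function names and the cascade sizes are hypothetical.

```python
def largest_size_ratio(cascade_sizes, n_nodes):
    """Largest fraction of the network brought down by a single removal."""
    return max(cascade_sizes) / n_nodes


def cumulative_cascade_probability(cascade_sizes, n_nodes, s):
    """Fraction of initial removals whose damage ratio is at least s."""
    hits = sum(1 for size in cascade_sizes if size / n_nodes >= s)
    return hits / len(cascade_sizes)


# One cascade size per initially removed node (hypothetical values, N = 10).
sizes = [1, 1, 2, 4, 1, 8, 1, 2, 1, 1]
n_nodes = 10

worst = largest_size_ratio(sizes, n_nodes)
p_small = cumulative_cascade_probability(sizes, n_nodes, 0.2)
p_large = cumulative_cascade_probability(sizes, n_nodes, 0.5)
```

Evaluating the cumulative probability over a grid of values of the size parameter gives a curve that starts at 1 and falls toward 0 as the required damage grows.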
If the thresholds are close to zero, the connectivity of the network is very high. Not only is there then no difference between the networks, but the computational difficulty also increases. If the thresholds are relatively large, many nodes in the network become isolated. Thresholds that are too large or too small therefore do not conform to practical biological significance. In this paper, four sets of thresholds are used for the three types of logic network to analyze how the cascading failure parameters change. When the capacity parameter is fixed, the corresponding cumulative distribution curve for each type under these thresholds is shown in Figure 4. Clearly, as the cascade size increases, the cumulative probability falls to zero. The distributions have a similar form for the types we studied: they are broad-tailed, indicating that most cascades are small, while some are quite large. These large failures represent lethal events, so the behavior of the cumulative probability at large cascade sizes is of special interest. In fact, as the thresholds increase, more and more isolated nodes and smaller connected components appear in the network. The connectivity of the network is reduced, and the integrity of the network structure is seriously compromised.
4. Conclusions
In our model, each node in the network is deleted in turn as the initial failure, and the cascading failure then spreads over the network. We look for the nodes whose removal leads to larger-scale cascading failures. Removing such a node initially leads to the failure of other nodes in the network. The four genes CDH1, MYC, SOS2, and CDKN1A are obtained from the prostate cancer network. Similarly, five genes, TOP2A, REL, SHH, ROS1, and CHEK2, are selected in the colon cancer gene network, and three genes, RBL1, MAPK9, and PIK3CA, in the adenocarcinoma of lung gene network. Table 4 lists the gene nodes causing larger cascading failures under all thresholds, together with their in-degrees and out-degrees. The nodes that cause large-scale successive failures of the network are those with larger in-degree or out-degree. Nodes with larger degree are closely associated with other nodes; if they are deleted, the cascade spreads through almost the entire network. However, nodes with larger degree do not necessarily lead to large-scale cascading failures, which are also determined by the coupling relationships between nodes, such as logical motifs.

The logic motifs are doublets that combine second-order or first-order logical relationships sharing at least one common node. In Figure 5, panels (a), (b), (c), and (d) show all possible second-order logic doublets centered on a node. Panels (e) and (f) in Figure 5 give further logic doublets centered on a node. These logic doublets are named according to the position of the central node as “both-in,” “both-out,” and “in-out” doublets. For example, (a) and (e) are “both-in” doublets for the central node: the node has only incoming edges and no outgoing edge, so its load is $+\infty$. For (b), (c), and (f), the central node has only outgoing edges and no incoming edge, so its in-degree strength is zero and its out-degree strength is positive; its load is therefore 0. For (d), the central node has both incoming and outgoing edges, so its in-degree and out-degree strengths are both greater than zero; hence, its load is a finite positive value. If a node closely connected to other nodes by logical motifs is deleted initially, it can easily cause other nodes to break down.
Figure 5: Logic doublets centered on a node, panels (a)–(f).
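The naming of doublets by the position of the central node can be expressed as a small classifier. As a simplification that is assumed here rather than taken from the text, each doublet is reduced to a list of directed (source, target) pairs touching the central node; the function name `doublet_position` and the node labels are hypothetical.

```python
def doublet_position(node, edges):
    """Name a doublet by the position of `node` among its directed edges."""
    has_in = any(target == node for _, target in edges)
    has_out = any(source == node for source, _ in edges)
    if has_in and has_out:
        return "in-out"      # finite positive load
    if has_in:
        return "both-in"     # only incoming edges, infinite load
    if has_out:
        return "both-out"    # only outgoing edges, zero load
    raise ValueError("node does not take part in the doublet")


both_in = doublet_position("v", [("x", "v"), ("y", "v")])
both_out = doublet_position("v", [("v", "x"), ("v", "y")])
in_out = doublet_position("v", [("x", "v"), ("v", "y")])
```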
5. Discussion
In this study, we look into the propagation of cascading failures in gene logic networks triggered by an initial failure, using a one-by-one deletion strategy. A new load-capacity model for cascading failure at the nodes of a directed logic network is proposed. It attempts to explore the relationship between the robustness and the structure of the network. We apply the load-capacity cascading failure method based on degree strength to gene expression profile data from the NCBI for three types of cancer gene networks: adenocarcinoma of lung, prostate cancer, and colon cancer. We find that deleting hubs causes larger cascading failures. These nodes are therefore possibly related to the occurrence and development of the three types of cancer. Table 5 lists the genes and their annotations.

Some of these genes have been confirmed in the literature to be associated with the corresponding cancer. For example, Cherfas [20] found that the gene CHEK2 is closely related to the occurrence and development of colon cancer. Cai et al. [21] detected the expression of the SHH gene in 38 surgically resected colon cancer specimens; the aberrant state of the SHH signaling pathway may be involved in the development of colon cancer. The gene TOP2A encodes a DNA topoisomerase, which is a target for many anticancer drugs, and many of its variants are closely related to the development of drug resistance. The MYC gene is a regulator gene that codes for a transcription factor. It is located on chromosome 8 and is believed to regulate the expression of 15% of all genes by binding to Enhancer Box sequences and recruiting histone acetyltransferases. This means that, in addition to its role as a classical transcription factor, MYC also regulates global chromatin structure by regulating histone acetylation both in gene-rich regions and at sites far from any known gene. Koh et al. [22] found MYC to be one of the top genes overexpressed in human prostate cancer tissues, as compared to matched normal-appearing prostate tissue.
Baldi et al. [23] found that the expression levels of RBL1, whose protein is similar to that encoded by the gene pRb2, were negatively related to the histological stage and metastasis of lung tumors. This suggests that RBL1 acts as a tumor suppressor gene in lung cancer. The gene PIK3CA encodes an alpha subunit of phosphatidylinositol 3-kinase (PI3K). Samuels and Velculescu [24] found high-frequency variations of the PIK3CA gene in breast cancer and lung cancer. Most mutations are clustered in two regions of PI3K, the helical and catalytic domains, and at least one hotspot mutation results in increased kinase activity.
This paper proposed a load-capacity cascading failure model based on the degree strength of nodes and identified the influence of cascading failures on gene logic networks built from gene expression profiles. Through numerical experiments, the parameters of the cascading failure model on the networks were analyzed to relate network structure, such as degree, to cascading failure. Finally, we obtained the gene nodes that lead to larger-scale cascading failures on the networks under the chosen thresholds. These genes may play an important role in the development or metastasis of cancer. Because of limited computational capacity, the rank sum test is first used to determine the sets of significantly different genes at a given significance level, and this inevitably loses some genes related to the specific cancer. In addition, the specific biological significance of these genes still needs further validation by biologists.
Disclosure
The funding sponsors had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.
Conflicts of Interest
The authors declare no conflicts of interest.
Authors’ Contributions
Yulin Zhang conceived and designed the experiments; Kebo Lv performed the experiments; Xiao Lu analyzed the data; Maoxian Zhao contributed reagents/materials/analysis tools; Jionglong Su wrote the paper.
Acknowledgments
The research is supported by the National Natural Science Foundation of China (Grants 61370207, 61572522, 61503224, and 61773245), the National Natural Science Foundation of Shandong Province (Grant ZR2015FM014), and Qingdao Postdoctoral Research Project (2016110).
References
[1] B. A. Carreras, V. E. Lynch, I. Dobson, and D. E. Newman, “Critical points and transitions in an electric power transmission model for cascading failure blackouts,” Chaos: An Interdisciplinary Journal of Nonlinear Science, vol. 12, no. 4, pp. 985–994, 2002.
[2] R. Kinney, P. Crucitti, R. Albert, and V. Latora, “Modeling cascading failures in the North American power grid,” The European Physical Journal B, vol. 46, no. 1, pp. 101–107, 2005.
[3] J.-W. Wang and L.-L. Rong, “Robustness of the western United States power grid under edge attack strategies due to cascading failures,” Safety Science, vol. 49, no. 6, pp. 807–812, 2011.
[4] Z. Su, L. Li, H. Peng, J. Kurths, J. Xiao, and Y. Yang, “Robustness of interrelated traffic networks to cascading failures,” Scientific Reports, vol. 4, article 5413, 2014.
[5] J. J. Wu, H. J. Sun, and Z. Y. Gao, “Cascading failures on weighted urban traffic equilibrium networks,” Physica A: Statistical Mechanics and its Applications, vol. 386, no. 1, pp. 407–413, 2007.
[6] X. Guardiola, R. Guimera, A. Arenas, A. Guilera, D. Streib, and L. A. N. Amaral, “Macro- and micro-structure of trust network,” https://arxiv.org/abs/cond-mat/0206240.
[7] A. H. Razavi, D. Anggraini, R. Missaoui, J. Vaillancourt, and M. Talbi, “Modeling and predicting cascading removal phenomenon over social networks,” Social Network Analysis and Mining, vol. 4, no. 1, article no. 233, pp. 1–14, 2014.
[8] C. Yi, Y. Bao, J. Jiang, and Y. Xue, “Modeling cascading failures with the crisis of trust in social networks,” Physica A: Statistical Mechanics and its Applications, vol. 436, pp. 256–271, 2015.
[9] A. G. Smart, L. A. N. Amaral, and J. M. Ottino, “Cascading failure and robustness in metabolic networks,” Proceedings of the National Academy of Sciences of the United States of America, vol. 105, no. 36, pp. 13223–13228, 2008.
[10] Y. Zhang, S. Wang, L. Sun, Y. Zhang, and D. Meng, “Analysis for cascading failures with DNA repair function of gene network and its application,” Biomedical Engineering: Applications, Basis, and Communications, vol. 24, no. 3, pp. 237–244, 2012.
[11] A. E. Motter and Y. Lai, “Cascade-based attacks on complex networks,” Physical Review E: Statistical, Nonlinear, and Soft Matter Physics, vol. 66, no. 6, Article ID 065102, 4 pages, 2002.
[12] X. F. Wang and J. Xu, “Cascading failures in coupled map lattices,” Physical Review E: Statistical, Nonlinear, and Soft Matter Physics, vol. 70, no. 5, Article ID 056113, 2004.
[13] P. Crucitti, V. Latora, and M. Marchiori, “Model for cascading failures in complex networks,” Physical Review E: Statistical, Nonlinear, and Soft Matter Physics, vol. 69, no. 4, Article ID 045104, 2004.
[14] X. Fang, Q. Yang, and W. Yan, “Modeling and analysis of cascading failure in directed complex networks,” Safety Science, vol. 65, pp. 1–9, 2014.
[15] W.-X. Jin, P. Song, G.-Z. Liu, and H. E. Stanley, “The cascading vulnerability of the directed and weighted network,” Physica A: Statistical Mechanics and its Applications, vol. 427, pp. 302–325, 2015.
[16] P. M. Bowers, S. J. Cokus, D. Eisenberg, and T. O. Yeates, “Use of logic relationships to decipher protein network organization,” Science, vol. 306, no. 5705, pp. 2246–2249, 2004.
[17] P. M. Bowers, B. D. O'Connor, S. J. Cokus, E. Sprinzak, T. O. Yeates, and D. Eisenberg, “Utilizing logical relationships in genomic data to decipher cellular processes,” FEBS Journal, vol. 272, no. 20, pp. 5110–5118, 2005.
[18] S. Wang, Y. Chen, Q. Wang, E. Li, Y. Su, and D. Meng, “Analysis for gene networks based on logic relationships,” Journal of Systems Science & Complexity, vol. 23, no. 5, pp. 999–1011, 2010.
[19] Y. Zhang, K. Lv, S. Wang, J. Su, and D. Meng, “Modeling gene networks in Saccharomyces cerevisiae based on gene expression profiles,” Computational and Mathematical Methods in Medicine, vol. 2015, Article ID 621264, 10 pages, 2015.
[20] J. Cherfas, “DNA damage response: capture gene gap in order to prevent cells worse,” Science Focus, vol. 2, no. 3, pp. 60–66, 2007.
[21] Q. H. Cai, H. J. Tu, and J. H. Yu, “Expression of SHH and GLI1 in colon cancer and its clinical significance,” Chinese Journal of Medical Science, vol. 9, pp. 12–14, 2011.
[22] C. M. Koh, C. J. Bieberich, C. V. Dang, W. G. Nelson, S. Yegnasubramanian, and A. M. De Marzo, “MYC and prostate cancer,” Genes & Cancer, vol. 1, no. 6, pp. 617–628, 2010.
[23] A. Baldi, V. Esposito, A. De Luca et al., “Differential expression of Rb2/p130 and p107 in normal human tissues and in primary lung cancer,” Clinical Cancer Research, vol. 3, no. 10, pp. 1691–1697, 1997.
[24] Y. Samuels and V. E. Velculescu, “Oncogenic mutations of PIK3CA in human cancers,” Cell Cycle, vol. 3, no. 10, pp. 1221–1224, 2004.
Copyright
Copyright © 2018 Yulin Zhang et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.