Abstract

In recent years, proteomic studies have revealed several interesting findings in experimental sepsis models and septic patients. However, most studies investigated protein alterations only in single organs or in whole blood. To identify possible sepsis biomarkers and to evaluate the relationship between protein alterations in sepsis-affected organs and blood, proteomics data from the heart, brain, liver, kidney, and serum were analysed. Using functional network analyses in combination with hierarchical cluster analysis, we found that protein regulation patterns in organ tissues as well as in serum are highly dynamic. In the tissue proteome, the main functions and pathways affected were oxidoreductive activity and cell energy generation or metabolism, whereas in the serum proteome, functions were associated with lipoprotein metabolism and, to a minor extent, with coagulation, inflammatory response, and organ regeneration. Proteins from network analyses of organ tissue did not correlate with statistically significantly regulated serum proteins or with predicted proteins of serum functions. In this study, the combination of proteomic network analyses with cluster analyses is introduced as an approach to deal with high-throughput proteomics data and to evaluate the dynamics of protein regulation during sepsis.

1. Introduction

Proteomic studies and broad analyses of protein alterations in experimental and clinical sepsis make it possible to evaluate the systemic host response to a hit or injury and offer comprehensive information about the complex host response to infection [1]. Compared with genetic analyses, proteomics gives direct insight into protein expression rather than the indirect view obtained by studying gene regulation. As secreted proteins are signalling systems which convert genetic signals into enzymatic activity, it might be advantageous to study protein alterations [1]. Experience with proteomics in sepsis has already revealed several interesting findings in both experimental models and septic patients [2]. A small clinical study in septic shock patients, for example, demonstrated that proteomic analysis is a feasible tool to detect early alterations in protein expression and that there are specific differences in protein alterations between survivors and nonsurvivors at an early stage of septic shock [3].

Furthermore, other proteomic studies identified peptides as possibly useful sepsis biomarkers [4, 5]. In a septic mouse model, dynamic changes of tissue-specific septic protein profiles in blood plasma could be detected using proteomic analysis [6]. Most studies investigated protein alterations only in single organs or in whole blood [1]. However, sepsis represents a continuum ranging from simple infection and bacteraemia to life-threatening septic shock with multiple organ dysfunction. As sepsis is a highly dynamic process, protein alterations may vary during the different stages of the disease [7].

To identify possible protein regulation patterns and to evaluate the interaction between protein alterations in sepsis-affected organs and blood, proteomics data from the heart, brain, liver, kidney, and serum from previous studies were analysed [8–12].

Modern technologies make it possible to identify and quantify a large number of different proteins in proteomic experiments. As a consequence, the analysis of big data has become a bottleneck and represents a great challenge in proteomics [13]. In this study, protein network analyses in combination with cluster analyses are described as a possible approach to deal with high-throughput proteomic data.

2. Material and Methods

2.1. Experimental Sepsis Model and Proteomic Data from Previous Studies

In five previous studies, male Wistar rats were randomly assigned to a sepsis group (cecal ligation and puncture, CLP) or a control group (sham) [8–11, 14]. Surviving rats were sacrificed 12, 24, or 48 hours after sepsis induction. Organs and serum were removed after decapitation and prepared for proteomic analysis. Proteins were separated using two-dimensional difference gel electrophoresis (2D-DIGE). Each spot in the 2D-DIGE gel was matched to a corresponding spot in a reference gel, which was created as a virtual, computer-generated averaged gel. Then, normalized spot volumes of the sham and sepsis groups were compared at each time point. In addition to a statistically significant difference between the spots, differences were considered biologically relevant only if the protein expression factor (induction factor [IF]) changed more than twofold (IF < 0.5 or IF > 2). This helps ensure that regulated expression has actual biological significance, making the changes more likely to affect cellular functions. The IF in relation to the sham group was calculated by dividing the mean normalized spot volume of the sepsis group by that of the sham group. A value of 2.0 therefore indicates a twofold increase, and a value of 0.5 indicates a twofold decrease.
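The induction factor and the twofold criterion described above can be illustrated with a minimal Python sketch. The function names and the spot volumes are hypothetical and serve only as an example; the statistical significance test that was applied in addition is not reproduced here.

```python
from statistics import mean


def induction_factor(sepsis_volumes, sham_volumes):
    """Mean normalized spot volume of the sepsis group divided by that of the sham group."""
    return mean(sepsis_volumes) / mean(sham_volumes)


def is_biologically_relevant(induction, lower=0.5, upper=2.0):
    """True when expression changed more than twofold (IF < 0.5 or IF > 2)."""
    return induction < lower or induction > upper


# Hypothetical normalized spot volumes for one spot at one time point.
sepsis = [2.4, 2.0, 2.2]
sham = [1.0, 1.1, 0.9]

spot_if = induction_factor(sepsis, sham)
print(f"IF = {spot_if:.2f}, relevant: {is_biologically_relevant(spot_if)}")
```

Note that an IF of exactly 2.0 or 0.5 does not pass the filter in this sketch, because the criterion asks for a change of more than twofold.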

Significantly altered proteins were identified by mass spectrometry (MALDI-TOF MS) and used for further bioinformatical analysis to identify underlying networks, signalling cascades, and pathways affected.

2.2. Stepwise Bioinformatical Approach

In summary, as a first step, statistically significantly regulated proteins from blood and organ tissues of previous studies were identified and analysed by network analyses (GeneMania®). Afterwards, those statistically significant proteins were grouped (12, 24, and 48 hrs) using a hierarchical cluster analysis (Perseus®). As a third step, proteins of similarly early upregulated clusters underwent further network analysis to evaluate possible corresponding proteins or functions in blood and organ tissues. This approach to deal with pooled proteomic data is described in detail below.

2.3. Network Analysis of Proteins (GeneMania)

Sixty proteins from sepsis related organs (liver, kidney, heart, and brain) and twenty proteins from a serum analysis which were significantly altered, at least at one time point (12, 24, and 48 hours), were used for further bioinformatical analysis to identify underlying networks, signalling cascades, and pathways affected.

Biological functions of statistically significantly regulated proteins were identified using functional network analysis. GeneMania (http://www.genemania.org/) is a tool that helps predict interactions and functions of genes in terms of networks and, when available, of pathways [15, 16]. It allows the network to be customized, data sources to be chosen, and specific functions to be highlighted in a convenient graphical interface [15]. It is developed and continually updated by the University of Toronto and is funded by the Ontario Ministry of Research and Innovation. GeneMania knowledge is based on data from large databases, which include Gene Expression Omnibus, BioGRID, EMBL-EBI, Pfam, Ensembl, Mouse Genome Informatics, the National Center for Biotechnology Information, InParanoid, and Pathway Commons [15, 16]. It was developed for making predictions about gene or protein function based on a query list of proteins that share a function of interest. The software takes advantage of the continuing improvement and proliferation of high-throughput genomics and proteomics datasets by making up-to-date predictions of their interaction with other genes or proteins [15, 16].

As these software programs use different algorithms, we decided to perform the bioinformatical analyses with all of them in order to retrieve the highest number of predicted interactions, maintaining an acceptable level of confidence (0.400).

The associated functions detected by the software were downloaded in TAB-separated-values format and exported to Microsoft Excel® (Microsoft, Redmond, USA; version 2007) where they were filtered in subgroups which were reanalysed using GeneMania.

2.4. Hierarchical Cluster Analysis

Heat maps are an efficient method of visualizing complex datasets organized as matrices [17]. Perseus (Max Planck Institute of Biochemistry, Martinsried, Germany; v. 1.5.8.5) is a holistic software platform that allows continuous expansion of scalable analytical tools, their smooth integration, and reusability while providing the user with explicit documentation of the analysis steps and parameters [18]. Quantitative information concerning proteins that had statistically significantly altered expression at 12, 24, and 48 hours after the induction of sepsis was converted to a TSV (Tab-Separated Values) text file using Microsoft Excel (Microsoft, Redmond, USA; version 2007). Each value was reported as fold change in comparison to the sham group values, so that a positive number represents a higher expression of a spot at 12, 24, or 48 hours while a negative number represents lower expression of a spot at 12, 24, or 48 hours. In this format the data were analysed using the free software Perseus, which performed the Z-scoring and, subsequently, the hierarchical cluster analysis. The resulting heat map can be interpreted on the basis of colour intensity. In our case, a red brick represents a protein whose expression at a particular time was increased when compared to the value of the same protein in the sham group at that time.
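To make the two steps concrete, the following Python sketch applies row-wise Z-scoring followed by a simple agglomerative clustering to a handful of hypothetical fold-change profiles. It is not a reproduction of the Perseus implementation: the use of the sample standard deviation, Euclidean distance, and average linkage are assumptions made for illustration, and the protein names and values are invented.

```python
from math import sqrt
from statistics import mean, stdev


def z_score(row):
    """Standardize one protein's time course (assumption: sample standard deviation)."""
    m, s = mean(row), stdev(row)
    if s == 0:
        return [0.0 for _ in row]
    return [(value - m) / s for value in row]


def euclidean(a, b):
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(cluster_a, cluster_b, profiles):
    """Mean pairwise distance between the members of two clusters."""
    distances = [euclidean(profiles[i], profiles[j])
                 for i in cluster_a for j in cluster_b]
    return sum(distances) / len(distances)


def hierarchical_clusters(profiles, n_clusters):
    """Merge the two closest clusters until n_clusters remain."""
    clusters = [[name] for name in profiles]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = average_linkage(clusters[i], clusters[j], profiles)
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Hypothetical fold changes versus sham at 12, 24, and 48 hours.
fold_changes = {
    "P1": [3.0, 1.2, 1.0],   # early peak
    "P2": [2.8, 1.1, 1.0],   # early peak
    "P3": [1.0, 3.0, 1.1],   # peak at 24 hours
    "P4": [1.0, 1.2, 3.1],   # late peak
    "P5": [1.1, 1.0, 2.9],   # late peak
}

z_profiles = {name: z_score(row) for name, row in fold_changes.items()}
clusters = hierarchical_clusters(z_profiles, n_clusters=3)
print(clusters)
```

In this toy example the proteins with an early peak, the protein peaking at 24 hours, and the proteins with a late peak end up in separate clusters, which corresponds to the blocks of similar colour in a heat map.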

2.5. Identification of Regulation Pathways and Biomarker Candidates

On the basis of the cluster analysis, further subgroup network analyses were performed for proteins from sepsis-related organs (liver, kidney, heart, and brain) that were similarly upregulated at 12 hours or at 12 and 24 hours after sepsis induction, in order to find regulation patterns and identify possible biomarkers.
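The selection of early upregulated proteins can be sketched as a simple rule on the fold-change values. In the study the subclusters were taken from the hierarchical cluster analysis; the threshold-based rule below, the twofold threshold, and the protein names are assumptions used only to illustrate the two early patterns.

```python
def early_pattern(fold_changes, threshold=2.0):
    """Label a protein by its early regulation pattern.

    fold_changes holds the values at 12, 24, and 48 hours relative to sham;
    positive values mean higher and negative values lower expression.
    """
    fc12, fc24, _fc48 = fold_changes
    if fc12 > threshold and fc24 > threshold:
        return "up at 12 and 24 h"
    if fc12 > threshold:
        return "up at 12 h"
    return None


def early_upregulated(proteins, threshold=2.0):
    """Group protein names by early upregulation pattern."""
    groups = {"up at 12 h": [], "up at 12 and 24 h": []}
    for name, values in proteins.items():
        label = early_pattern(values, threshold)
        if label is not None:
            groups[label].append(name)
    return groups


# Hypothetical proteins with fold changes at 12, 24, and 48 hours.
proteins = {
    "protein_A": (2.6, 1.1, 1.0),
    "protein_B": (3.1, 2.4, 1.2),
    "protein_C": (-2.2, 1.0, 2.5),
    "protein_D": (1.3, 2.8, 1.1),
}

groups = early_upregulated(proteins)
print(groups)
```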

3. Results

3.1. Biological Functions of Statistically Significantly Regulated Proteins

Collecting data from the 5 previous studies [8–12], 80 statistically significantly altered proteins (113 spots in total) from sepsis-related organs and serum were identified.

Using GeneMania, separate network analyses regarding serum proteins (Figure 1) and regarding sepsis related organs (liver, kidney, heart, and brain) (Suppl. Figure 1) were performed. The detected organ-related functions were subsequently filtered. From the original 159 functions, we found 38 functions filtered for prevalence (arbitrary cutoff at 12%) (Table 1) and 51 functions filtered by absolute number (cutoff ≥ 7) (Suppl. Table 1).
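The two filters can be written down in a few lines of Python. This is a sketch under stated assumptions: prevalence is taken here to be the number of annotated genes in the displayed network divided by the number of genes with that annotation in the genome (the two counts reported in the supplementary tables), which is one plausible reading and not confirmed by the text. The field names and the example counts are hypothetical.

```python
def prevalence(entry):
    """Assumed definition: annotated genes in the network / annotated genes in the genome."""
    return entry["in_network"] / entry["in_genome"]


def filter_by_prevalence(functions, cutoff):
    """Keep functions whose prevalence reaches the cutoff (e.g. 0.12 for 12%)."""
    return [f for f in functions if prevalence(f) >= cutoff]


def filter_by_absolute(functions, cutoff):
    """Keep functions with at least `cutoff` annotated genes in the network."""
    return [f for f in functions if f["in_network"] >= cutoff]


# Hypothetical rows in the style of a network analysis function export.
functions = [
    {"function": "glycolysis", "in_network": 8, "in_genome": 40},
    {"function": "cellular respiration", "in_network": 9, "in_genome": 120},
    {"function": "gluconeogenesis", "in_network": 5, "in_genome": 30},
    {"function": "example function", "in_network": 2, "in_genome": 100},
]

by_prevalence = [f["function"] for f in filter_by_prevalence(functions, 0.12)]
by_absolute = [f["function"] for f in filter_by_absolute(functions, 7)]
print(by_prevalence)
print(by_absolute)
```

The example shows why the two filters give different lists: a function with many annotated genes in the network can still have a low prevalence if the annotation is very common in the genome, and vice versa.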

Most of the functions were associated with oxidoreductive activity and cell energy generation or metabolism (ATP production, tricarboxylic metabolism, glycolysis, gluconeogenesis, cell respiration, etc.) and nucleotide or nucleoside metabolism. One-third of the proteins found are usually located in the mitochondria.

The functions identified for statistically significantly altered serum proteins using 2% as the cutoff for prevalence are shown in Table 2. Functions identified using an absolute-number cutoff of 6 are shown in Suppl. Table 2. Most of the functions for the serum proteins were associated with lipoprotein metabolism and, to a minor extent, with coagulation, inflammatory response, and organ regeneration.

3.2. Hierarchical Cluster Analyses and Heat Maps

Quantitative information concerning proteins that had statistically significant altered expression in the liver, kidney, heart, and brain at 12, 24, and 48 hours from the induction of sepsis was analysed using Perseus (Max Planck Institute of Biochemistry, Martinsried, Germany; v. 1.5.8.5) which performed the hierarchical cluster analysis (Figure 2).

The cluster analysis revealed several groups of regulation patterns with different combinations of proteins up/downregulated or unchanged at different time points. Three subclusters of similarly upregulated proteins at 12 or 12 and 24 hours were identified. Since these early upregulated subclusters may contain possible candidates for sepsis biomarkers, further network analyses were conducted for these subgroups highlighted in Figure 2.

In the same way, a cluster analysis of statistically significantly regulated serum proteins was performed (Figure 3). In this analysis, two subgroups of upregulated proteins at 12 hours or 12 and 24 hours could be identified. Comparing likewise regulated proteins from sepsis related organs and serum, no concordance of proteins could be detected.

3.3. Network Analyses of Similarly Regulated Proteins

The subclusters of similarly up- and downregulated proteins in the first 24 hours after sepsis induction for both sepsis related organs and serum underwent further GeneMania analyses to identify networks and predicted proteins within these networks and their associated functions. By identifying predicted proteins, we expected a higher likelihood of finding statistically significantly regulated proteins both in organ tissues and in serum. For subcluster 1 in the organ tissue cluster analysis, we found no network using GeneMania.

The network for subcluster 2 revealed 19 functions filtered by absolute number (cutoff ≥ 5) and 17 functions filtered by prevalence (cutoff ≥ 10%) (Suppl. Tables 3 and 4). Most of the functions in this subcluster were related to oxidoreductive activity and cell energy generation and metabolism, as in the unselected group of all tissues. None of the predicted proteins within the functions and pathways was previously found in serum.

Using the same cutoff values in subcluster 3, 27 functions filtered by absolute number and 20 functions filtered by prevalence were found (Suppl. Tables 5 and 6). Most of the functions regarding this subcluster were related to energy generation and metabolism and to muscle contractile function (heart), and nucleoside metabolism. Similar to subcluster 2, none of the predicted proteins within the functions and pathways was previously found in the serum.

In serum proteins, a network analysis of subcluster 1 (Figure 3) with an upregulated group at 12 hours (C3, Apoa1, Kng2, Dpysl2, and Igh-6) was not possible because the number of proteins involved was too low. Therefore, subclusters 1 (see above) and 2 (Hp, Alb, Apoa1, Kng2, Tf, Gc, Apoe, and Cfb) were analysed together and most of the functions were related to lipid metabolism or lipid transport and to a lower extent associated with immune response (Suppl. Tables 7 and 8).

4. Discussion

In this study, proteomic data of various experiments all using the same experimental sepsis model (i.e., cecal ligation and puncture, CLP) were analysed using bioinformatical methods to identify protein regulation patterns altered by sepsis [8–12]. To the best of our knowledge, this is the first study which compares proteomic data from a broad set of organs during sepsis to associated protein regulation patterns and pathways in serum. Furthermore, we used protein network analysis in combination with hierarchical cluster analysis to deal with large proteomic data. The combination of cluster analysis and network analysis is well established in proteomic studies. However, so far, this approach has not been described in an animal model to analyse sepsis-induced protein alterations at different time points.

4.1. Functions of Significantly Regulated Proteins

The study reveals several major findings. By using protein network analysis software (GeneMania), we demonstrated that most of the statistically significantly regulated proteins from the heart, liver, kidney, and brain were associated with oxidoreductive activity, cell energy generation or metabolism (ATP production, tricarboxylic metabolism, glycolysis, gluconeogenesis, cell respiration, etc.), and nucleotide or nucleoside metabolism (Table 1 and Suppl. Table 1). Most of the functions of statistically significantly regulated serum proteins were related to lipoprotein metabolism and, to a minor extent, to coagulation, inflammatory response, and organ regeneration.

It appears plausible that in the clinical setting of sepsis there is an alteration of proteins involved in energy generation in tissues since an imbalance between oxygen delivery and consumption is a hallmark of sepsis and particularly septic shock [19]. Therefore, it is conceivable that expression of proteins related to energy generation might be a compensatory mechanism to account for intracellular hypoxia. Previous studies also showed an association between sepsis and organ failure. Future studies could investigate whether this is a feature specific to sepsis and septic shock or common to different causes of shock and hypoxia.

Concerning lipoprotein expression, which was found to be altered in serum in our study, there is an evolving interest in the use of lipoproteins, especially high-density lipoprotein, both as a biomarker [20, 21] and as a potential therapeutic approach in sepsis [21, 22].

4.2. Course of Protein Alterations

Hierarchical cluster analysis confirmed that protein regulation in sepsis-related organs and tissues is a dynamic process. We found that proteins can be up- or downregulated or even remain unchanged at different time points (12 hours, 24 hours, or 48 hours) after induction of sepsis. Regarding the early phase of sepsis, that is, up to 24 hours after sepsis induction, three subclusters of organ proteins were identified which were upregulated at 12 or at 12 and 24 hours (Figure 2). Subclusters were defined based on the hypothesis that statistically significantly upregulated proteins in organ tissues can probably be found simultaneously in blood. We focused on the early phase of sepsis up to 24 hours after sepsis induction because, from a clinical point of view, a timely diagnosis of sepsis is crucial. Proteins of these subclusters could in principle be candidates for early sepsis biomarkers if detected in blood.

4.3. Congruency in Regulation between Tissue and Serum

Another major finding of our analysis was that proteins in early upregulated subclusters of the serum (Figure 3) did not correspond to tissue proteins of different organs analysed. Also, functions in upregulated subclusters in serum identified by GeneMania network analysis did not correspond to functions in early upregulated organ tissue clusters. In serum, functions were related to lipoprotein metabolism and, to a minor extent, to coagulation, inflammatory response, and organ regeneration, whereas in organ tissues most functions were associated with energy generation and metabolism and with muscle contractile function (heart) and nucleoside metabolism (Suppl. Tables 5 and 6). Finally, predicted proteins from network analyses of organ tissue did not correlate with significantly regulated serum proteins or with predicted proteins of serum functions.

4.4. Evaluation of the Bioinformatical Approach

In our bioinformatical analysis we sought to assess whether the dynamic process of sepsis-associated alterations in the tissue proteome is reflected in serum proteome changes. Several subclusters of early upregulated tissue proteins could be detected, which are potentially interesting candidates for sepsis biomarkers if detected in blood. Furthermore, functions and pathways in organ tissues associated with early upregulated protein clusters could be compared to altered functions in blood. However, none of the tissue proteins was found in the serum and, moreover, none of the predicted proteins from the GeneMania network functions correlated with serum proteins. Even though no identical proteins were detected in both the serum and the organ tissues, our bioinformatical approach could be helpful for understanding the pathophysiology of sepsis. For example, the cluster analyses revealed which proteins and functions were regulated at different stages during the course of sepsis. Furthermore, one-third of the statistically significantly regulated proteins are located in the mitochondria, underlining the importance of altered mitochondrial function and even mitochondrial damage in the host response to sepsis [12, 23–25].

Even though no common protein was found in both the serum and organ tissue, this does not necessarily mean that the detected proteins are not potential candidates for sepsis biomarkers. Probably, the organ-related proteins were not found in the serum because they were below the detection limit, and more sensitive techniques are needed. By using network analyses we were able to predict proteins possibly involved in functions and pathways of upregulated clusters. As a result, the number of possible biomarker candidates could be increased. The detection of a single protein or a set of proteins upregulated in organ tissue as well as in serum would warrant further research into those proteins.

In blood plasma, numerous tissue proteins can be found. However, most of them do not contribute to the genuine blood plasma functions [26]. Currently, there is limited knowledge on the regulation of the blood plasma proteome and it is unknown to what extent various tissues can affect blood plasma protein composition in sepsis [6].

In a recent septic mouse model, the authors introduced an MS-based strategy to monitor the dynamics of tissue and cell-specific proteins in the blood plasma and constructed a proteome-wide tissue atlas to demonstrate how the surrounding tissue and cells influence the blood plasma in severe infectious diseases [6]. In their study, only one single time point at 48 hours after sepsis induction was analysed, whereas in our study we sought to identify early time-dependent correlations between changes in organ and blood proteome using a hierarchical cluster analysis. Hierarchical cluster analysis turned out to be useful in both detection of possible biomarkers and protein regulation patterns in clinical or experimental sepsis research [27–30].

In a recent review article the authors stated that “in case of the proteomic investigation, the challenges occur at all levels ranging from sample preparation and data gathering over the raw data integration and database searching to the functional interpretation of large datasets” [31]. Thus, our bioinformatical analysis might be a promising way to deal with large proteomic datasets and complex interactions and functions. In future, proteomic techniques will steadily improve and data quantities will increase. Thus, new methods are required to help interpret these results. In this context, our study should be interpreted as hypothesis generating rather than definitive. Moreover, there are no current standards on settings or cutoff levels for network analysis software. For the present study, we used the default settings of the network analysis software. Cutoffs for proteins and functions were defined arbitrarily, only to find a pragmatic balance between finding relevant sepsis-related functions and eliminating nonspecific proteins. Of course, these settings and cutoffs might have influenced the results of our analysis, and further analyses should address this. Likewise, protein clusters in our study were defined from a clinical point of view. Future studies should evaluate the most appropriate selection algorithms and software settings and should compare different network analysis software programs. Nonetheless, every study is unique, and software settings and cutoffs also depend on the type of analysis and the hypotheses.

4.5. Limitations

Some limitations of our study have to be mentioned. Statistically significantly regulated tissue proteins from different organs were mixed in the network analyses. Thus, we cannot be sure that the derived functions and pathways in fact correspond to these functions in the respective organs. However, the previous organ proteomics analyses of this sepsis model confirm that most of the functions are associated with energy metabolism, mitochondrial function, and lipid metabolism [8–12].

The number of functions presented in this analysis was limited by using arbitrary cutoffs for prevalence and for the absolute number of proteins involved in the network. As a consequence, only functions in which a representative number of proteins was present were identified.

Interestingly, we found no typical acute phase proteins in our analysis. This is probably due to the technical limitations of proteomic analyses. As common inflammation biomarkers are relatively small proteins and their concentration might be low even after upregulation, this could explain why those typical proteins were missed in our analysis. With further advances in proteomic techniques and more sensitive methods, small and low-abundance proteins might also be detected in future.

5. Conclusion

In summary, in our stepwise comparison of dynamic organ tissue proteome changes to serum proteome changes we were able to demonstrate that regulation patterns in organ tissues as well as in serum are highly dynamic. Subclusters of proteins can be upregulated or downregulated or even remain unchanged at different stages of sepsis. The main functions and pathways affected in the tissue proteome were oxidoreductive activity and cell energy generation or metabolism, whereas in the serum proteome, functions were associated with lipoprotein metabolism and, to a minor extent, with coagulation, inflammatory response, and organ regeneration. Using hierarchical cluster analyses and functional network analyses (GeneMania) including predicted network proteins, we were not able to detect correlating proteins or functions in organ tissues and blood. Furthermore, we were not able to identify promising candidates for sepsis biomarkers. Nonetheless, this analysis provides new insights into protein regulation during sepsis, and this bioinformatical approach could be helpful to deal with high-throughput proteomic data.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this article.

Authors’ Contributions

Andreas Hohn and Ivan Iovino equally contributed to the manuscript.

Supplementary Materials

Suppl. Table 1: fifty-one functions filtered by absolute number (cutoff ≥ 7) from the original 159 functions derived from GeneMania network analysis of the whole dataset without the serum proteins. Column 1 shows the functions names. Columns 2 and 3 show, respectively, the number of annotated genes in the displayed network and the number of genes with that annotation in the genome. In column 5, names in bold letters represent the genes predicted by the software.

Suppl. Table 2: network analysis serum functions absolute number. Thirty-three functions filtered by absolute number (cutoff ≥ 6) from the original 166 derived from GeneMania network analysis of the serum-protein dataset. Column 1 shows the functions names. Columns 2 and 3 show, respectively, the number of annotated genes in the displayed network and the number of genes with that annotation in the genome. In column 5, names in bold letters represent the genes predicted by the software.

Suppl. Table 3: subcluster 2 (gpd1, eno1, aldh5a1, coro1a, atp6v1b2, ckb, alb, fasn, acy1, fbp1, fscn1, aldh7a1, cct3, gpd1, ogdh, oxct1, and ca1). Seventeen functions filtered by prevalence (cutoff ≥ 10%) from the original 51 functions derived from GeneMania network analysis of the whole dataset without the serum proteins. Column 1 shows the functions names. Columns 2 and 3 show, respectively, the number of annotated genes in the displayed network and the number of genes with that annotation in the genome. In column 5, names in bold letters represent the genes predicted by the software.

Suppl. Table 4: subcluster 2 (gpd1, eno1, aldh5a1, coro1a, atp6v1b2, ckb, alb, fasn, acy1, fbp1, fscn1, aldh7a1, cct3, gpd1, ogdh, oxct1, and ca1). Nineteen functions filtered by absolute number (cutoff ≥ 5) from the original 51 functions derived from GeneMania network analysis of the whole dataset without the serum proteins. Column 1 shows the functions names. Columns 2 and 3 show, respectively, the number of annotated genes in the displayed network and the number of genes with that annotation in the genome. In column 5, names in bold letters represent the genes predicted by the software.

Suppl. Table 5: subcluster 3 (gapdh, cps1, aldoa, glul, myh6, myh7, oplah, got1, and acss1). Twenty functions filtered by prevalence (cutoff ≥ 10%) from the original 90 functions derived from GeneMania network analysis of the whole dataset without the serum proteins. Column 1 shows the functions names. Columns 2 and 3 show, respectively, the number of annotated genes in the displayed network and the number of genes with that annotation in the genome. In column 5, names in bold letters represent the genes predicted by the software.

Suppl. Table 6: subcluster 3 (gapdh, cps1, aldoa, glul, myh6, myh7, oplah, got1, and acss1). Twenty-seven functions filtered by absolute number (cutoff ≥ 5) from the original 90 functions derived from GeneMania network analysis of the whole dataset without the serum proteins. Column 1 shows the functions names. Columns 2 and 3 show, respectively, the number of annotated genes in the displayed network and the number of genes with that annotation in the genome. In column 5, names in bold letters represent the genes predicted by the software.

Suppl. Table 7: subcluster of similarly upregulated proteins from the serum-protein dataset (c3, kng2, dpysl2, igh-6, apoa1, hp, alb, tf, gc, apoe, and cfb). Forty-four functions filtered by prevalence (cutoff ≥ 15%) from the original 190 functions derived from GeneMania network analysis of this subcluster dataset. Column 1 shows the functions names. Columns 2 and 3 show, respectively, the number of annotated genes in the displayed network and the number of genes with that annotation in the genome. In column 5, names in bold letters represent the genes predicted by the software.

Suppl. Table 8: subcluster of similarly upregulated proteins from the serum-protein dataset (c3, kng2, dpysl2, igh-6, apoa1, hp, alb, tf, gc, apoe, and cfb). Fifty-nine functions filtered by absolute number (cutoff ≥ 5) from the original 190 functions derived from GeneMania network analysis of this subcluster dataset. Column 1 shows the functions names. Columns 2 and 3 show, respectively, the number of annotated genes in the displayed network and the number of genes with that annotation in the genome. In column 5, names in bold letters represent the genes predicted by the software.

Suppl. Figure 1: network analysis organs without serum. In a GeneMania network analysis, each circle represents a gene. The input proteins/genes are depicted as striped circles of the same size, while the monochromatic circles, whose size is proportional to the number of interactions according to the software, can be considered as “relevant” related genes found by GeneMania searching in many large, publicly available biological datasets (including protein-protein, protein-DNA, and genetic interactions, pathways, reactions, gene and protein expression data, protein domains, and phenotypic screening profiles). Lines linking different circles can be distinguished from their colour; mainly violet represents coexpression (when expression levels are similar across conditions in a gene expression study); light orange represents predicted functional relationships between genes; light blue represents colocalization (when genes are expressed in the same tissue or proteins found in the same location); light yellow represents shared protein domains (when two gene products have the same protein domain).