Abstract

The development of new methods to identify influential spreaders in complex networks has been a significant challenge in network science over the last decade. Its practical significance spans from graph theory to interdisciplinary fields like biology, sociology, economics, and marketing. Despite the rich literature in this direction, we find little consistent effort to compare and rank existing centralities considering both the topology and the opinion diffusion model, as well as the context of simultaneous spreading. To this end, our study introduces a new benchmarking framework targeting the scenario of competitive opinion diffusion; our method differs from classic SIR epidemic diffusion by employing competition-based spreading supported by the realistic tolerance-based diffusion model. We review a wide range of state-of-the-art node ranking methods and apply our novel method on large synthetic and real-world datasets. Simulations show that our methodology offers much higher quantitative differentiation between ranking methods on the same dataset and notably high granularity for a ranking method over different datasets. We are able to consistently pinpoint which influence ranking method performs better against another on a given complex network topology. We consider that our framework offers a forward leap when analysing diffusion characterized by real-time competition between agents. These results can greatly benefit the tackling of social unrest, rumour spreading, political manipulation, and other vital and challenging applications in social network analysis.

1. Introduction

Estimating node influence can lead to an improved understanding of the natural interaction patterns within real-world populations, biological entities, or technological structures. Metrics for quantifying the influence potential of nodes have wide-ranging interdisciplinary applications including disease modelling [1–7], information transmission [8–11], behavioural intelligence [3, 12–15], business management [16, 17], finances [18, 19], and pharmacology and drug repurposing [20, 21]. Being able to correctly determine and rank influential nodes in empirical networks can have direct applicability in problems like impeding epidemic outbreaks [22], accelerating innovation diffusion, evaluating marketing and financial trends [18], discovering new drug targets in pathway networks [20], and predicting essential proteins in protein interaction networks and gene regulatory networks [23]. Regardless of the context, the most common way to capture information on intricate real-world interactions is a complex network [24–27]. Specifically, social network analysis (SNA), as a subdomain of network science, models social structures characterized by emergent interaction.

Considerable effort has been devoted over the last decade to assessing the importance of nodes in many types of complex networks. Novel approaches, combined with classic graph centrality measures, have led to the emergence of three main categories of influence ranking methods. The first category of scientists argues that the location of a node is more important than its immediate ego network and thus proposed k-core decomposition [28, 29], along with improved variants [30–33]. The second category of scientists quantifies the influence of a node based solely on its local surroundings [34–36]. Finally, the third category of scientists evaluates node influence according to various states of equilibrium for dynamical processes, such as random walks [37, 38] or step-wise refinements [39].

Each ranking method, regardless of its nature and category, is validated through a state-of-the-art benchmarking methodology, which—in almost all cases of network science—involves the usage of the SIR epidemic model [40–42]. This process may be suitable for validating metrics in an individual context, in order to produce a verdict on whether the ranking method is good enough, but often not more. For SNA, however, collective interplay is inherent [43] and the aforementioned real-world application contexts imply competition between multiple opinions, so a one-sided perspective will often not be reliable. A recent study shows that the traditional SIR model provides a poor description of empirical data when modelling disease dynamics, as it lacks the infectious recovery dynamics that better describe social network dynamics [44]. Consequently, we consider the SIR model inadequate for our benchmarking context, as it fails to model competition and opinion fluctuations. As such, we propose a more robust benchmarking principle that implies simultaneous competition between two or more information (opinion, rumour) sources, that is, in the same network and at the same time. To this end, we make use of the existing tolerance-based diffusion model [45], which represents, to the best of our knowledge, a novel benchmarking methodology in SNA.

To better underline the limitations incurred by using a SIR simulation versus our proposed competition-based benchmark, we illustrate a comparative example in Figure 1. In (a) and (b), we apply two distinct ranking methods (orange and blue), one at a time, and show that the diffusion process is unrestrained; orange manages to cover the network faster than blue, due to the higher dispersion of its three initial opinion sources. In the SIR context, the two simulations may lead to the conclusion that the orange ranking method is better than the blue one. In reality, we consider the scenario in (c) as the more probable one. Opinions will diffuse simultaneously and face constraints due to competition over each node (i.e., orange and blue exclude one another). In this case, we intuitively suggest that blue might win in terms of network coverage, as it has a tighter initial cluster forming around its three opinion sources. Consequently, the main observations are the following:
(i) Neither of the two opinions will achieve coverages as high as in the one-sided scenarios.
(ii) Simulation time may be longer than in either one-sided scenario, due to the need for attaining a state of balance in the emergent network.
(iii) The final ratio between the two opinions is impossible to determine by one-sided simulations and is only determinable from the emergence of the two competing opinions (e.g., initial spreader position, connectivity of the spreaders, and community structure).

In light of these remarks, we propose a novel benchmarking framework which offers more reliable insights into comparing ranking methods aimed at real-world applications of social networks. The paper starts by presenting the benchmarking methodology in detail, followed by simulation results. We highlight the overlapping of several popular ranking methods, in terms of selecting the same initial seeds, then proceed to compare the ranking methods first using SIR as a reference and then in pairs (one versus one) using our proposed methodology. Finally, we discuss the results, what our testing methodology can additionally offer, and the implications of considering competing opinions. The Methods section details the validation datasets used and offers a brief review of current state-of-the-art ranking measures used in complex networks.

2. A Novel Competition-Based Influence Ranking Benchmark

State-of-the-art benchmarking methodologies for spreading processes on complex networks often rely on the SIR (SIS) model [40–42]. With this approach, an initial subset of nodes is infected according to a centrality measure; then the simulation measures how fast the surrounding susceptible nodes become infected and eventually recovered (or removed). Indeed, if we take the example of an epidemic, it spreads independently from other epidemics and has its own temporal evolution. On the other hand, if we consider opinion exchange between social agents, an opinion is often exclusive (with regard to other contradicting opinions) and also depends on the timing of the spread of other ideas.

We argue that a SIR model cannot accurately model fluctuations and direct competition between social agents. Also, as long as the infected nodes survive, they will eventually cover the whole network. Finally, the SIR model is sensitive to initial parameters, like the infection probability and the recovery duration, needing step-wise refinements to obtain the desired results, which may easily vary in other experimental settings. Alternatively, we find several SIR variants designed for competitive diffusion processes [5], [6], [7], but they target competitive epidemic diffusion.

As a novel, more robust, and more realistic alternative, we propose the usage of the tolerance-based model [45], which implies competition between two or more opinion sources in the same network, at the same time. To the best of our knowledge, this kind of benchmarking methodology is novel in the literature. Other graph-based predictive diffusion models [46] include the classic linear threshold (LT) [47], independent cascade (IC) [48], voter model [49], Axelrod model [50], and Sznajd model [51]. These models use either fixed thresholds or thresholds evolving according to simple probabilistic processes that are not driven by the internal state of the social agents [46]. However, the tolerance model is the first opinion diffusion model to propose a truly dynamic threshold (i.e., a node's state evolves according to the dynamic interaction patterns). Therefore, based on its novelty and realism potential, we are encouraged to use the tolerance model in our paper.

2.1. The Tolerance-Based Opinion Diffusion Model

The tolerance model [45] is based on the classic voter model [49], being a refinement of the stubborn agent model [11, 52], with the unique addition of a dynamic decision-making threshold, called tolerance, for each node.

We further introduce the specific network science notations to mathematically define our model. Given a social network $G = (V, E)$, the neighbourhood of a node $i \in V$ is defined as $N_i = \{j \in V : (i, j) \in E\}$. Exemplifying for a context with two competing opinions, we introduce two disjoint sets of stubborn agents which act as opinion sources. Stubborn agents never change their opinion, while all other (regular) agents update their opinion based on the opinion of one or more of their direct neighbours. We denote by $x_i(t)$ the opinion of agent $i$ at time $t$; regular agents start with a random opinion value $x_i(0) \in [0, 1]$. We denote by $s_i(t)$ the state of agent $i$ at moment $t$. In the case of a discrete opinion, the state coincides with the opinion, $s_i(t) = x_i(t) \in \{0, 1\}$; in the case of a continuous opinion, the state is derived from the opinion as in the following equation:

$$s_i(t) = \begin{cases} 0, & x_i(t) < 0.5\\ \varnothing, & x_i(t) = 0.5\\ 1, & x_i(t) > 0.5 \end{cases}$$

In the assumed social network, agents $i$ and $j$ are neighbouring nodes if there is an edge $(i, j) \in E$ that connects them. Some agents may not have an opinion or may not participate in the diffusion process (i.e., $s_i(t) = \varnothing$), so interacting with these agents generates no opinion update. A regular node will periodically poll one random neighbour (simple diffusion) or all its neighbours (complex diffusion), average the surrounding opinion $\overline{x}_{N_i}(t)$ (i.e., over the vicinity of an arbitrary node $i$ at time point $t$), and update its opinion using a weighted combination of its past opinion and that of its neighbour(s), as

$$x_i(t+1) = (1 - \theta_i)\, x_i(t) + \theta_i\, \overline{x}_{N_i}(t)$$

where $\theta_i$ is the tolerance of node $i$.

The tolerance parameter $\theta_i \in [0, 1]$ is the amount of accepted external opinion and changes after each interaction based on whether a node has faced a competing opinion or a supporting opinion (in a binary context with opinions 0 and 1). Once a node is in contact with the same opinion for a long enough time, it becomes intolerant ($\theta_i \rightarrow 0$), so that the network converges towards a state of balance [53]. Opinion fluctuates and is transacted by all nodes, but stubborn agents are the only nodes which are not influenced in turn, acting as perpetual sources for the same opinion [11].

The evolution towards both tolerance and intolerance varies in a nonlinear fashion, as an agent under constant influence becomes indoctrinated at an increased rate over time. If that agent faces an opposing opinion, it will eventually start to progressively build confidence in the other opinion. As such, the tolerance model employs a nonlinear fluctuation function, unlike most models in literature [54, 55]. Based on realistic sociopsychological considerations in the dynamical opinion interaction model, we model tolerance evolution as

$$\theta_i(t+1) = \begin{cases} \max\{\theta_i(t) - \varepsilon_0 \alpha_0,\ 0\}, & s_i(t) = s_j(t)\\ \min\{\theta_i(t) + \varepsilon_1 \alpha_1,\ 1\}, & s_i(t) \neq s_j(t) \end{cases}$$

Tolerance is decreased by $\varepsilon_0 \alpha_0$ if the state of the agent before the interaction, $s_i(t)$, is the same as the state $s_j(t)$ of the randomly interacting neighbour $j$. If the states are not identical (i.e., opposite opinion), then the tolerance is increased by the dynamic product $\varepsilon_1 \alpha_1$. The two scaling factors, $\alpha_0$ and $\alpha_1$, both initialized with 1, act as weights (i.e., counters) which are increased to account for every event in which the initiating agent keeps its old opinion (i.e., tolerance decreasing) or changes its old opinion (i.e., tolerance increasing). Therefore, scaling factor $\alpha_0$ is increased by +1 as long as the agent interacts with agents having the same state (i.e., $s_i(t) = s_j(t)$) and is reset to 1 otherwise. Scaling factor $\alpha_1$ is increased as long as the interacting state is always different from that of the agent and is reset if the states are identical. We introduced the scaling factors to model bias; they increase the magnitude of the two tolerance modification ratios $\varepsilon_0$ (intolerance modifier weight) and $\varepsilon_1$ (tolerance modifier weight). The two ratios are chosen with fixed values, determined as explained in [45].
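To make the mechanism concrete, the following sketch implements one simple-diffusion interaction step. It assumes binary opinions encoded as values in [0, 1] (state 0 below 0.5, state 1 above), uses the symbol names from this section, and picks illustrative values for the two modifier weights rather than the calibrated ones from [45].

```python
def interact(x, theta, alpha_same, alpha_diff, stubborn, i, j,
             eps0=0.01, eps1=0.01):
    """One simple-diffusion interaction: node i polls neighbour j.

    x           -- per-node opinions in [0, 1]
    theta       -- per-node tolerance in [0, 1]
    alpha_same / alpha_diff -- per-node scaling counters (start at 1)
    stubborn    -- set of stubborn agents (never change opinion or tolerance)
    eps0 / eps1 -- intolerance / tolerance modifier weights (illustrative values)
    """
    if i in stubborn:
        return
    state_before = 0 if x[i] < 0.5 else 1
    state_j = 0 if x[j] < 0.5 else 1
    # Weighted combination of the old opinion and the polled neighbour's opinion.
    x[i] = (1 - theta[i]) * x[i] + theta[i] * x[j]
    if state_before == state_j:
        # Same opinion faced: node grows intolerant; the bias counter reinforces.
        theta[i] = max(theta[i] - eps0 * alpha_same[i], 0.0)
        alpha_same[i] += 1
        alpha_diff[i] = 1
    else:
        # Opposing opinion faced: node progressively builds confidence in it.
        theta[i] = min(theta[i] + eps1 * alpha_diff[i], 1.0)
        alpha_diff[i] += 1
        alpha_same[i] = 1
```

Running this step repeatedly over random node pairs reproduces the qualitative behaviour described above: long runs of agreement drive a node's tolerance towards 0, while sustained exposure to the opposing opinion accelerates its tolerance growth.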

In accordance with this presented mechanism, we designate two sets of stubborn agents, $S_0$ and $S_1$, to act as initial spreaders simultaneously. In other words, we let all chosen centrality metrics compete against each other in a one-to-one diffusion scenario, where sets $S_0$ and $S_1$ consist of the top spreaders selected by each pair of centralities. We ensure that $|S_0| = |S_1|$ and $S_0 \cap S_1 = \varnothing$. We find this approach to offer a good qualitative comparison basis for estimating the effectiveness of node ranking methods.

2.2. Alternate Opinion Assigning Approach

We further find that most state-of-the-art ranking methods have various degrees of overlapping in terms of the top spreader nodes they assign. As such, we introduce an alternate opinion assigning (AOA) approach in order to distribute nodes in the two sets of spreaders $S_0$ and $S_1$ evenly and equitably for both ranking methods, say $A$ and $B$. Figure 2 exemplifies the AOA approach, where ranking method $A$ is depicted with orange and method $B$ is depicted with blue.

AOA means that each one-to-one influence ranking benchmark consists of two (or a multiple of two) independent simulations. Considering that ranking methods $A$ and $B$ produce two partially overlapping sets of top spreaders, we alternate the simulations as follows:
(i) In the first simulation, method $A$ (orange) has priority: one starts by assigning the first (top 1) spreader from $A$ as an orange stubborn agent. This implies that the spreader remains in $S_0$ and is removed from $B$'s list, if present.
(ii) Then, the first spreader from $B$ is assigned as a blue stubborn agent, removing it from $A$'s list, if present.
(iii) We continue assigning nodes alternately, giving each node its opinion and filtering it out from the other list of spreaders.
(iv) The AOA stops when both sets reach the desired size and discards any extra nodes so that $|S_0| = |S_1|$, ensuring that both sets have an equal number of stubborn agents, namely, half of the desired spreader population.
(v) In the second simulation, method $B$ (blue) has priority: one starts by assigning the first (top 1) spreader from $B$ as a blue stubborn agent. This implies that the spreader remains in $S_1$ and is removed from $A$'s list, if present.
(vi) The exact same AOA process is repeated, with $B$ having priority over $A$.

The impact of AOA is highlighted in Figure 2, as we end up assigning two significantly different spreader sets for methods $A$ and $B$. Methodologically speaking, one benchmark must consist of at least two simulations, but for better experimental results, one may run $2n$ simulations, ensuring that AOA is applied (i.e., $n$ simulations favouring method $A$ and $n$ simulations favouring method $B$).
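The AOA procedure described above can be sketched as follows. The function name and the list-based representation are our own illustration; we assume each ranking supplies enough candidate nodes and that the caller runs the procedure twice with the arguments swapped, so each method gets priority once.

```python
def alternate_opinion_assignment(rank_a, rank_b, size):
    """Split top spreaders from two (possibly overlapping) rankings into
    disjoint, equally sized stubborn-agent sets S0 and S1.

    rank_a, rank_b -- node lists ordered from most to least influential;
                      rank_a is the priority method in this simulation
    size           -- total spreader budget (each set receives size // 2)
    """
    s0, s1 = [], []
    ia = ib = 0
    half = size // 2
    while len(s0) < half or len(s1) < half:
        if len(s0) < half:
            # Priority method picks its best node not yet claimed by the other.
            while rank_a[ia] in s1:
                ia += 1
            s0.append(rank_a[ia])
            ia += 1
        if len(s1) < half:
            while rank_b[ib] in s0:
                ib += 1
            s1.append(rank_b[ib])
            ib += 1
    return s0, s1
```

A second call with `alternate_opinion_assignment(rank_b, rank_a, size)` yields the companion simulation in which the other method has priority, matching steps (v) and (vi) above.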

3. Results

We set out to discover fundamental drivers in the underlying graph structure which shape and influence opinion spreading in complex networks. To this end, our experimental setup is focused on a comparative benchmark analysis involving the reviewed node centrality metrics defined in Section 5.2. For an objective comparison, we make use of two types of datasets: synthetic data (10,000 node random, mesh, small-world, and scale-free networks [56]) and real-world data (consisting of large, representative complex networks sized between 1,900 and 29,000 nodes).

In this section, two sets of results are detailed. First, we explore the correlations between ranking methods for assigning top spreaders. Naturally, within the top nodes ranked by different centralities, we will eventually find common nodes. As such, we detect the amount of node overlapping and express the correlation of two measures as the fraction of spreaders shared by their top-spreader sets. For the second experimental phase of benchmarking influence ranking methods, we ensure that $S_0 \cap S_1 = \varnothing$ by alternately assigning a node to each set, while removing it from the list of candidates of the other centrality, as explained by the AOA approach (Figure 2).
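One plausible way to quantify this overlap is the fraction of common nodes among the two top-spreader selections; this is a sketch under our own assumptions, not necessarily the exact correlation formula used in the measurements below.

```python
def spreader_overlap(rank_a, rank_b, s):
    """Fraction of common nodes among the top-s spreaders of two rankings.

    rank_a, rank_b -- node lists ordered from most to least influential
    s              -- spreader set size
    Returns a value in [0, 1]: 1.0 means both centralities select exactly
    the same top-s nodes; 0.0 means fully disjoint selections.
    """
    top_a, top_b = set(rank_a[:s]), set(rank_b[:s])
    return len(top_a & top_b) / s
```

Computed for every pair of centralities, these values populate a symmetric correlation table of the kind shown in Figure 3.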

3.1. Correlations between Influence Ranking Methods

Real-world datasets can be viewed as topological compositions of the basic graph properties found in synthetic Erdős–Rényi random (Rand), forest-fire mesh (Mesh), Watts–Strogatz small-world (SW), and Barabási–Albert scale-free (SF) networks [56–58], so we rely solely on measurements on the synthetic datasets from Table 1. As such, the correlation process is applied on the four synthetic network types in order to better highlight distinguishable characteristic topological features, like uniform node degree distribution (random networks), high local clustering and community formation (mesh networks), high clustering combined with long-range links (small-world networks), and low average path length with hub formation (scale-free networks).

Figure 3 presents the correlations between selected pairs of centralities. Correlations are measured over several spreader set sizes, expressed as fractions of the graph size, and we find that the correlation drops slightly as the spreader set size increases. The average changes in spreader correlations between the smallest and the largest spreader set size are −0.289, −0.193, −0.189, and −0.088 over the four topologies. This overall drop in correlation can be explained as follows: more of the same nodes are determined as top spreaders by ranking methods when the spreader sets are small. As the set size increases, each ranking method adds more nodes to the set of spreaders and the chances of overlapping drop. However, when we look at each individual centrality measure in turn, we notice that some increase the correlation amount, while others decrease it. Section 1 and Figure 1 in the Supplementary Materials detail and discuss these measurements for 10 selected ranking methods, over the four synthetic topologies, as the spreader set size increases.

As a representative overview, we present in Figure 3 only the results for one representative spreader set size. For each centrality combination, we provide the numerical correlation and a symmetric graphical correlation. For example, the degree–Hirsch index correlation in the random network translates into a mid-blue gradient in the corresponding symmetric cell of the table. The last column in the table expresses the average correlation on each line; summing up and averaging the values on the last column, we obtain the cumulated correlation for each topology.

Quantitatively and also intuitively, the highest spreader correlation is obtained on the scale-free network, as it naturally consists of a very small core of hub nodes. These hubs act like an invariant of the topology with respect to the spreader set size and are likely to be selected as top spreaders by all centrality measures. Even if the set size is changed, the correlation remains high (see Supplementary Materials, Section 1). On the opposite end of the spectrum lie the random and mesh topologies. Both are characterized by uniformity in node properties, so various centralities exhibit a higher heterogeneity in their top spreader selection, leading to the smaller measured correlations. Lastly, the small-world network borrows the uniformity of meshes and the long-range links of a random network. Here, we measure a relatively high average correlation of 0.606, denoting that this network has a stable core of influential nodes, like the scale-free network.

Analysing each centrality in turn, we notice that there are higher correlations between ranking methods of the same category, for example, diffusion-based HITS, PageRank, and LeaderRank. Furthermore, some centralities are more suitable for some topologies and less efficient for others. For example, we confirm that degree is considerably more relevant for scale-free networks (correlation of 0.802 with other centralities), but only marginally relevant for the small-world network (correlation of 0.437). The same observation is consistent with closeness and betweenness. To better highlight the spatial overlapping of spreader nodes, we provide a visual example in the Supplementary Materials, Section 2.

Arching over the presented results, we motivate the usage of alternate opinion assigning (AOA) by the high node overlapping, ranging between 30% and 70%, found among all state-of-the-art centralities.

3.2. Independent SIR Simulations

For a comparative basis, we first estimate the efficiency of an influence ranking method by employing classic SIR simulation [41, 42]. In this sense, we measure both the time needed to infect the majority of nodes (expressed in simulation iterations) and the final coverage of the infection (expressed as a percentage of the total network size). We use the following SIR-specific parameter values [40, 41]: a spreader ratio of 5% (i.e., top 5% of nodes selected as spreaders), a stop condition of 95% (i.e., at least 95% of the population to be infected), an infection probability of 5% (i.e., 5% probability to become infected during an interaction), and a recovery duration of 10 (i.e., 10-iteration duration of the infectious state for a node).
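A minimal discrete-time SIR benchmark following these parameter choices can be sketched as below; the adjacency-list representation, the synchronous update scheme, and the fixed random seed are our own assumptions for reproducibility, not details prescribed by the referenced setup.

```python
import random

def sir_simulation(adj, seeds, beta=0.05, recovery=10, target=0.95, rng=None):
    """Discrete-time SIR run for one set of initial spreaders.

    adj      -- adjacency list {node: [neighbours]}
    seeds    -- initially infected nodes (top spreaders of one ranking)
    beta     -- per-contact infection probability (5%)
    recovery -- iterations a node stays infectious (10)
    target   -- stop once this fraction of nodes is infected or recovered
    Returns (elapsed iterations, final coverage as a fraction of the network).
    """
    rng = rng or random.Random(42)
    n = len(adj)
    infected = {v: recovery for v in seeds}   # node -> remaining infectious time
    recovered = set()
    t = 0
    while infected and (len(infected) + len(recovered)) / n < target:
        newly = set()
        for v in list(infected):
            for u in adj[v]:
                if u not in infected and u not in recovered and rng.random() < beta:
                    newly.add(u)
            infected[v] -= 1
            if infected[v] == 0:
                recovered.add(v)       # infectious period over
                del infected[v]
        for u in newly:
            infected[u] = recovery
        t += 1
    coverage = (len(infected) + len(recovered)) / n
    return t, coverage
```

Repeating such runs per ranking method and averaging the returned time and coverage yields per-method figures of the kind reported in Table 2.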

The simulation results in Table 2 represent values for the infection time and coverage averaged over 10 repeated simulations on each dataset, for each individual ranking method (i.e., 10 ranking methods × 8 datasets × 10 repetitions, amassing to a total of 800 simulations). Through these results, we want to highlight that, when a diffusion process is run for each ranking method in an individual manner (i.e., one by one), the feedback provided regarding ranking efficiency is often limited.

The results for most topologies are very close in terms of the measured time and coverage, suggesting that differentiation between ranking methods is unreliable. For instance, analysing the coverages in Table 2, the standard deviation of the coverage on the Rand network is negligible, and the measured difference between the most efficient ranking method (Hirsch index) and the least efficient ranking method (degree) is minimal. Similarly, the standard deviations for the real-world networks are very small, as are the differences between the most and least efficient ranking methods. For a visual representation of the coverage benchmark results, refer to Supplementary Materials, Section 4.

We consider these simulation results to highlight an overall lack of perspective regarding which ranking method is better on a given topology. Likewise, the best ranking methods are not consistent across datasets. For instance, HITS turns out to be the most efficient ranking method on a SW, but the least efficient on a SF network; Deg is least efficient on Rand, 2nd on Mesh, 7th on SW, and 6th on SF, yet it comes 8th if we average all results; Btw is the 5th on OSN, 4th on FB, 5th on Emails, and 3rd on POK, and comes 3rd overall. This kind of inconsistency further supports our claims for an improved type of benchmarking methodology.

3.3. Competition-Based Simulations

We let each of the selected centrality measures compete in a one-to-one scenario over the 4 synthetic and 4 real-world datasets. Every dataset comprises a total of 45 pairs of simulations (one pair for each combination of the 10 ranking methods), translating into 90 individual simulations due to AOA. For statistical rigour, each experiment is repeated 10 times, consisting of a simulation batch of 20 simulations per pair, leading to 900 simulations per dataset and amassing to an overall 7,200 unique experiments. The large quantity of numerical results is available in the Supplementary Materials, Section 3 and Tables 1 and 2.

Condensing the simulation results, we present in Table 3 the average performance of the 10 ranking methods on the 8 datasets. This performance is quantified as an average percentage of opinion coverage obtained from the one-to-one competition benchmarks (e.g., HITS obtains a coverage of 65.23% on the OSN dataset).

Similar to the state-of-the-art SIR epidemic benchmarking, our obtained results are easy to understand and offer the possibility of direct comparison between ranking methods on the same dataset. On the other hand, we notice two improvements brought by applying our methodology:
(1) There is much higher variation between measures on the same dataset. For example, on the FB dataset, the coverages obtained by the best and worst ranking methods differ by roughly an order of magnitude, which suggests an obvious performance difference; using SIR as a benchmark, the corresponding coverages are nearly identical.
(2) There is greater emergent granularity between measures on different datasets. For example, Cls turns out to be much less efficient on a SF topology (1.99%) than on a SW topology (18.37%).

Assessing the results in Table 3, we find an objective comparison of state-of-the-art ranking methods used in current social networks research. Figure 4 presents these cumulated performance indicators; the top three ranking methods, according to our original proposed methodology, are LeaderRank (LR), HITS, and node degree (Deg).

The cumulated results in Figure 4 are based solely on the 8 datasets used throughout the paper. With more datasets used, the averaged performances will slightly differ. However, valuable insight is further offered by the visualization of performances on each dataset in turn; these results are detailed in the Supplementary Materials, Section 5.

Additionally, we provide an illustrative visual example of the opinion coverages at the end of a simulation, after balancing is attained [53] with the used tolerance diffusion model [45]. The Mesh topology is exemplified here because it offers the most intuitive 2D spatial feedback after applying a force-directed layout. To this end, Figure 5 shows the coverage of competing centrality measures in three different scenarios:
(i) Two ranking methods with high overlapping and a balanced outcome: Deg (orange) 56.70% and LR (blue) 43.30% (Figure 5(a)).
(ii) Two ranking methods with moderate overlapping and inefficient seed selection for one method (Btw): LR (orange) 74.26% and Btw (blue) 25.74% (Figure 5(b)).
(iii) Two ranking methods with low overlapping and an extreme outcome: Cls (orange) 5.24% and HI (blue) 94.76% (Figure 5(c)).

The validation of our novel benchmarking methodology employs a standard strategy for the selection of multiple spreaders. After a review of the most recent advances in complex network analysis, we find that the method of simply selecting the top spreaders from the entire network is consistently found throughout the literature [35, 37, 38, 59–62]. Nevertheless, there are several alternatives for selecting multiple spreaders, which we detail in the Supplementary Materials, Section 6.

3.4. Comparison between Benchmarking Methods

To highlight the superior quantitative power of our competition-based benchmark, we aggregate the results in Table 4. Here, we measure the difference between the most and least efficient ranking methods, as well as the difference between the top two ranking methods, for each dataset in turn. Seeking higher overall differences, we find that our proposed benchmarking methodology is, in general, more insightful than the classic SIR benchmark. As such, when measuring the difference between the most and least efficient methods, individual SIR benchmarking only manages to produce differences of ≈1.14% on average between ranking methods, while our proposed solution offers differences of ≈91% on average. When trying to discern between the top 2 ranking methods on a particular dataset, SIR manages to place them apart by only ≈0.31% on average, while our method produces higher differences of ≈3.56% on average.
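The two aggregate indicators behind Table 4 can be computed as in the following sketch; the function name and the dictionary-based input format are illustrative choices of ours.

```python
def benchmark_gaps(coverages):
    """Given {ranking method: coverage %} measured on one dataset, return
    (best-worst gap, gap between the top two methods) in percentage points."""
    vals = sorted(coverages.values(), reverse=True)
    return vals[0] - vals[-1], vals[0] - vals[1]
```

Applied per dataset to either the SIR coverages or the competition-based coverages, these gaps directly reproduce the comparison of separating power discussed above.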

Another advantage of our proposed method is the overall uniformity obtained for the performances of each centrality across the 8 selected datasets. For instance, if LR and HITS result as the most efficient spreading methods on one topology, their performance is replicated with high confidence on the other topologies as well. When employing SIR benchmarking, the performances are not consistent across datasets. This aspect is suggested visually in Figure 6, where we highlight the most (LR) and least (Cls) efficient centralities, as they are ranked over the 8 datasets. It is easy to notice how LR is positioned in the top 3 and Cls in the last 2–3 methods overall. In the individual SIR benchmarking, there is no such uniformity.

In conclusion, our benchmarking methodology—which is specifically designed for the competitive social network context—provides significant quantitative separation between influence ranking methods on synthetic and real social network topologies. This numerical separation is over one order of magnitude greater than the one provided by classic SIR simulation—a standard methodology used in epidemic spreading, where the diffusion context is less competitive and more ego-centred. Therefore, we encourage the use of our proposed method in specific real-world applications of dynamic social networks.

4. Discussion

One of the significant research challenges in network science is to rank a node’s ability to spread information in a network [43]. As spreading is used to model real-world processes such as epidemic contagion and information propagation [2, 3, 20, 22, 63], our paper aims to improve current methodology in validating and comparing state-of-the-art ranking methods in the social network context. Numerous alternative ranking methods have been developed, relying on classic graph centralities, localized targets [63], optimal percolation [43], and so on. While the challenge at hand remains partially unsolved, it is argued that insights are uncovered only through the optimal collective interplay of all the influencers in a network [43]. This emergent behaviour is also the key to our study, namely, the introduction of a benchmarking technique employing simultaneous competition-based spreading.

The main motivation of this paper is the need for increased realism in the social network context, where real-world applications imply simultaneous diffusion by their nature. Nevertheless, our methodology may be tailored to other interdisciplinary fields of science. One area of research that can benefit directly from our methodology is network biology. Specifically, determining node centrality is a hot topic in biological networks. For instance, a study shows that the phenotypic consequence of a single gene deletion is determined by the topological position in the molecular interaction network [64]; also, the relationship between the network roles of disease genes and their tolerance to germs shows that cancer driver genes occupy the most central positions [65]. Many biological studies rely on the theoretical results from network science, and they often only employ degree and betweenness centrality in their analysis. With our study, we aim to broaden the methodological perspective for interdisciplinary fields.

We find advantages over existing benchmarking methodology relying on the SIR epidemic model. Notably, our competition-based method offers much greater quantitative separation between ranking methods on the same dataset (e.g., degree is roughly 14 times more performant than closeness on the Facebook dataset); also, we obtain higher granularity for a ranking method on different datasets (e.g., closeness is roughly 9 times less efficient on a scale-free topology than on a small-world topology).

Further developments of our method are possible. For instance, one can increase the number of spreaders acting simultaneously in a network beyond two; accordingly, alternate opinion assigning (AOA) must be modified to fit the multiple opinion sources. A recent study discusses the importance of targeting specific localized targets, rather than obtaining a high coverage of the network [63]. Our method can easily be extended to measure the target coverage during or at the end of a spreading simulation. Another study finds that each complex network may have a small “control set” of nodes, which, when triggered, will influence the whole network [66]. These control sets are believed to be surprisingly small (5–10% of nodes) and may also be paired with our benchmarking methodology.

Finally, we consider that the topology-aggregated competition-based results we obtained (e.g., in Figure 4 of the Supplementary Materials) can be used to define a functional fingerprint of real-world networks based on how influence ranking methods perform on them. Namely, we notice that the 10 centrality measures used here perform in a unique, distinguishable manner on the four fundamental synthetic topology models. This uniqueness can be quantified as a characteristic vector for random, mesh, small-world, and scale-free networks; any real-world dataset can then be compared to other datasets through these four fingerprint vectors. Overall, we believe that our work advances the state of the art on a significant challenge in the study of opinion spreading phenomena and also serves as a good starting point for many of the still unsolved problems and new ideas found in the literature.

5. Methods

5.1. Validation Datasets

We motivate the inclusion of synthetic datasets in the study by the need to clearly distinguish between the characteristic topological features of a network that influence spreading. These features include a normal versus power-law degree distribution, lower versus higher clustering, lower versus normal path lengths, the existence of long-range links, and hub formation. The four chosen network models represent the four fundamental topology types out of which empirical networks are further built [26, 56, 57].

Given our focus on influence spreading in the field of social network analysis, we choose four undirected (weighted and unweighted) networks consisting of various types of social relationships. As such, we rely on a weighted online social network (OSN) with 1899 users [67], an unweighted Facebook friendship network (FB) consisting of 3172 students from a Computer Science faculty in Romania [68], an unweighted email exchange network (Emails) from London’s Global University with 12,625 contacts [69], and a weighted friendship network (POK) with 28,876 users from the Slovakian POK platform [70]. On the other hand, all synthetic networks consist of 10,000 nodes and are algorithmically generated using default parameters found in the state of the art. Table 1 provides the basic statistics for each such network.

5.2. Influence Ranking Methods

In order to define each centrality metric, we make use of the following graph theory-specific notations. A social network is a graph G = (V, E) formed out of N = |V| nodes and M = |E| edges. The edges may also be directed (i.e., (i, j) ≠ (j, i)) or weighted (i.e., they have weights w_ij). The connectivity of the graph is characterized by an adjacency matrix A = {a_ij}, where a_ij = 1 (or a_ij = w_ij in the weighted context) if nodes i and j are connected and a_ij = 0 otherwise. Furthermore, the degree of a node i is denoted as k_i, the neighbourhood of node i is the set of nodes Γ(i) adjacent to it, and the average degree of G is ⟨k⟩.

The reviewed measures considered for benchmarking in this paper are classified in one of three categories: structure-based, location-based, and diffusion-based rankings.

5.2.1. Structure-Based Measures

Structure-based measures require the topological information of the graph—either local (e.g., ego network, vicinity) or global (e.g., path-based). Under local measures, we first mention the degree centrality (Deg) of a node i, i.e., its number of links k_i; it is easy to use and efficient but less relevant in some real-world scenarios [34, 38], as studies show that Deg can fail to identify influential nodes because it is limited to the ego network of each node [34, 71].

The local centrality (LC) measure was introduced as a trade-off between the low-relevance degree centrality and other time-consuming measures [34]. LC of a node v considers both the nearest and the next nearest neighbours and is defined as

LC(v) = Σ_{u ∈ Γ(v)} Q(u), with Q(u) = Σ_{w ∈ Γ(u)} N(w),

where Γ(v) is the vicinity (set of neighbours) of node v, N(w) is the number of the nearest and the next nearest neighbours of node w, and Q(u) is the sum of N(w) over each node w in Γ(u). LC can be considered more effective than degree centrality because it uses more information from the vicinity of distance 2, while having a much lower computational complexity than the betweenness and closeness centralities.
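To make the two-level summation concrete, the following is a minimal pure-Python sketch of LC on an undirected graph stored as a dictionary of neighbour sets (the graph and function names are our own illustration, not the original implementation):

```python
def two_hop_count(adj, w):
    # N(w): number of nearest and next-nearest neighbours of w
    reach = set(adj[w])
    for u in adj[w]:
        reach |= adj[u]
    reach.discard(w)
    return len(reach)

def local_centrality(adj, v):
    # LC(v) = sum over u in the vicinity of v of Q(u), where
    # Q(u) is the sum of two-hop counts over the neighbours of u
    def q(u):
        return sum(two_hop_count(adj, w) for w in adj[u])
    return sum(q(u) for u in adj[v])

# demo: a 5-node path graph 1-2-3-4-5
path = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4}}
```

On the demo path, the middle node 3 obtains the highest LC score, as expected from its central position.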

Another local ranking measure is ClusterRank (CR), proposed by Chen et al. [35]. CR quantifies the influence of a node i by taking into account not only its direct influence (out-degree k_i^out) and the influence of its neighbours (like in the case of PageRank) but also its clustering coefficient c_i [56]. Formally, the ClusterRank score of node i is defined as

s_i = f(c_i) Σ_{j ∈ Γ(i)} (k_j^out + 1),

where the term f(c_i) represents the effect of i’s local clustering, the term +1 results from the contribution of j itself, and Γ(i) is the vicinity of node i. Based on empirical analysis [35], the authors propose the exponential function f(c_i) = 10^(−c_i).
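A short illustrative sketch of CR on an undirected graph follows; as an assumption of this sketch, plain degrees replace out-degrees, which is the natural simplification for the undirected networks used in this paper:

```python
def clustering_coeff(adj, i):
    # fraction of links present among the neighbours of i
    nbrs = list(adj[i])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for a in range(k) for b in range(a + 1, k)
                if nbrs[b] in adj[nbrs[a]])
    return 2.0 * links / (k * (k - 1))

def cluster_rank(adj, i):
    # s_i = 10^(-c_i) * sum over neighbours j of (k_j + 1)
    return 10 ** (-clustering_coeff(adj, i)) * \
        sum(len(adj[j]) + 1 for j in adj[i])

# demo: a triangle 1-2-3 with a pendant node 4 attached to 3
g = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
```

Note how the 10^(−c_i) factor penalizes node 1, which is fully embedded in the triangle (c_1 = 1), relative to node 3, which bridges towards the pendant node.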

The local centrality with a coefficient, denoted CLC by Zhao et al. [71], combines the previous CR and LC methods. The number of neighbouring nodes is used to identify cluster centres and is combined with a decreasing function of the local clustering coefficient of a node, called the coefficient of local centrality. Mathematically, the influence of node v is measured as its local centrality scaled by this coefficient.

Considering the global information of the graph can give better insights, so we adopt the widely used betweenness (Btw) and closeness (Cls) centralities [56]. Betweenness of a node i is expressed as the fraction of shortest paths between node pairs that pass through the node and is defined as [26]

Btw(i) = Σ_{s ≠ i ≠ t} σ_st(i)/σ_st,

where σ_st is the number of shortest paths between nodes s and t, and σ_st(i) denotes the number of shortest paths between s and t which pass through node i.

Closeness centrality of a node i is defined as the inverse of the sum of distances d(i, j) to all other nodes in G; it can be considered a measure of how long it will take to spread information from a given node to the other reachable nodes in the network [56]:

Cls(i) = 1/Σ_{j ≠ i} d(i, j).
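For unweighted graphs, closeness reduces to a single breadth-first search per node; the sketch below illustrates only closeness (betweenness additionally requires shortest-path counting, e.g., via Brandes’ algorithm, and is omitted here). The code is our own minimal illustration:

```python
from collections import deque

def closeness(adj, i):
    # inverse of the sum of BFS distances from i to all
    # reachable nodes (unweighted graphs only)
    dist = {i: 0}
    q = deque([i])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    total = sum(dist.values())
    return 1.0 / total if total else 0.0

# demo: on a path 1-2-3 the middle node is closest to all others
path = {1: {2}, 2: {1, 3}, 3: {2}}
```
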

5.2.2. Location-Based Measures

Location-based measures also require the structural information of the graph but build on the belief that the location of a node in the network is more relevant than its immediate connectivity. Driven by the limitations of simple graph metrics, such as degree centrality, Kitsak et al. propose k-core decomposition to quantify a node’s influence, based on the assumption that nodes in the same shell have similar influence, and nodes in higher-level shells are likely to infect more nodes [28]. The k-core decomposition method was subsequently validated by several studies [28, 29]. While this method is often found in literature under both the names of k-core and k-shell decomposition, the two concepts differ. The k-core of a graph is the maximal subgraph such that every vertex has degree at least k. A k-shell (KS), on the other hand, is the set of vertices that are part of the k-core but not of the (k + 1)-core.
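The decomposition itself can be computed by iteratively pruning low-degree nodes; the following pure-Python sketch (our own illustration) assigns each node its k-shell index:

```python
def k_shell(adj):
    # iteratively peel nodes of (current) degree <= k, for
    # increasing k; the k-shell is the k-core minus the (k+1)-core
    deg = {v: len(ns) for v, ns in adj.items()}
    remaining = set(adj)
    shell = {}
    while remaining:
        k = min(deg[v] for v in remaining)
        queue = [v for v in remaining if deg[v] <= k]
        while queue:
            v = queue.pop()
            if v not in remaining:
                continue  # already peeled via a duplicate entry
            shell[v] = k
            remaining.remove(v)
            for u in adj[v]:
                if u in remaining:
                    deg[u] -= 1
                    if deg[u] <= k:
                        queue.append(u)
    return shell

# demo: triangle 1-2-3 (the 2-shell) with a pendant node 4 (the 1-shell)
g = {1: {2, 3, 4}, 2: {1, 3}, 3: {1, 2}, 4: {1}}
```
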

Experiments show that, when running a diffusion process on the network (e.g., SIR), nodes with the same k-shell value can produce quite different numbers of infected nodes, namely, different spreading influence [32]. This phenomenon suggests that the k-core decomposition method alone is not appropriate for ranking the global spreading influence in a network. Liu et al. [32] propose to solve this observed drawback by taking into account the shortest distance between a target node and the node set with the highest k-core value. In terms of the distance from a target node to the network core, the spreading influences of nodes with the same k-core value can be distinguished using the following equation:

In (9), k_s^max is the largest k-core value in G, d_ij is the shortest distance from node i to node j, and J is the network core, i.e., the node set whose k-core values equal k_s^max.
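Equation (9) combines the k-shell value with the distance to the core; here we sketch only the distance-to-core ingredient, assuming the shell indices are already available (e.g., from a k-shell decomposition), with our own illustrative helper names:

```python
from collections import deque

def distance_to_core(adj, shell):
    # multi-source BFS from the network core J, i.e., the node
    # set with the largest k-shell value; returns d(i, J) for
    # every node reachable from the core
    ks_max = max(shell.values())
    core = {v for v, s in shell.items() if s == ks_max}
    dist = {v: 0 for v in core}
    q = deque(core)
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return dist

# demo: triangle {1,2,3} is the 2-shell core; pendant 4 sits at distance 1
g = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
shells = {1: 2, 2: 2, 3: 2, 4: 1}
```
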

In this paper, we also make use of the Hirsch index. The h-index (HI) [72] is a hybrid location-local-based centrality in which every node needs only a few pieces of information: the degrees of its neighbours. It was originally developed as a means to measure the scientific impact of scholars, but it now finds uses in quantifying the influence of users in social networks or of drugs in pharmacological interaction maps. The h-index of a node is defined as the largest value h such that the node has at least h neighbours with a degree of at least h.

The algorithm is intuitive to apply: for a node v with vicinity Γ(v), we order all its neighbours in descending order of their degree. The h-index is the last position h in the ordered list at which the degree of a neighbour is still at least as large as its position in the list. For example, given the sorted list of neighbour degrees (8, 6, 5, 3, 2), we deduce h = 3, because the degree at position 3 is 5 ≥ 3, but the degree at position 4 is 3 < 4.
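The procedure above translates directly into a few lines of code (our own illustrative sketch, separating the generic h-index computation from its application to a node’s neighbour degrees):

```python
def h_index(neighbour_degrees):
    # largest h such that at least h of the degrees are >= h
    degs = sorted(neighbour_degrees, reverse=True)
    h = 0
    for pos, d in enumerate(degs, start=1):
        if d >= pos:
            h = pos
        else:
            break
    return h

def node_h_index(adj, v):
    # apply the h-index to the degrees of v's neighbours
    return h_index(len(adj[u]) for u in adj[v])

# demo: in a triangle every node has two neighbours of degree 2
triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
```
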

5.2.3. Diffusion-Based Measures

Diffusion-based measures are based on obtaining a state of balance in the network after applying a nondeterministic spreading process, like a random walk. We make use of the fundamental eigenvector centrality (EC), which supposes that the influence of a node is determined not only by the number of its neighbours (i.e., degree centrality) but also by the influence of each neighbour [73]. Inspired by EC, there are three additional algorithms we discuss in this paper.

PageRank (PR) was first implemented as a random walk on the network of hyperlinks between web pages [74]. A damping factor d is introduced such that 1 − d is the probability for a user to jump to a random website, and d is the probability for the user to continue browsing through hyperlinks. The influence PR_i(t) of a node i at time t is given by

PR_i(t) = (1 − d)/N + d Σ_{j ∈ B(i)} PR_j(t − 1)/k_j^out,

where N is the number of nodes in G, B(i) is the set of nodes linking to i, k_j^out is the out-degree of node j, and d is typically set to 0.85 but requires step-wise optimization based on the network.
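To make the update concrete, a minimal power-iteration sketch in pure Python follows (our own illustrative code; spreading a dangling node’s score uniformly is one common convention, assumed here):

```python
def pagerank(adj, d=0.85, tol=1e-12, max_iter=200):
    # power iteration; adj maps node -> set of out-neighbours
    nodes = list(adj)
    n = len(nodes)
    pr = {v: 1.0 / n for v in nodes}
    for _ in range(max_iter):
        new = {v: (1.0 - d) / n for v in nodes}
        for u in nodes:
            if adj[u]:
                share = d * pr[u] / len(adj[u])
                for v in adj[u]:
                    new[v] += share
            else:
                # dangling node: spread its score uniformly
                for v in nodes:
                    new[v] += d * pr[u] / n
        delta = max(abs(new[v] - pr[v]) for v in nodes)
        pr = new
        if delta < tol:
            break
    return pr
```

On a symmetric two-node cycle, both nodes end up with a score of 0.5, and the scores always sum to 1.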

HITS is similar to PR, based on the concept that good hub nodes point to good authority nodes, and good authorities are pointed to by good hubs [75]. The hub score h_i of all nodes is initialized with 1 at time t = 0; the authority score a_i(t), at any moment in time t, is expressed as

a_i(t) = Σ_{j ∈ B(i)} h_j(t − 1),

where B(i) is the set of nodes pointing to i; the hub scores are then updated symmetrically from the authority scores.
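The mutually reinforcing updates can be sketched as follows (our own minimal illustration, with sum normalization after each step; in-neighbours are found by scanning the adjacency dictionary, which is fine for small demos):

```python
def hits(adj, iters=50):
    # adj maps node -> set of out-neighbours
    hubs = {v: 1.0 for v in adj}
    auths = {v: 1.0 for v in adj}
    for _ in range(iters):
        # authority: sum of hub scores of nodes pointing to v
        auths = {v: sum(hubs[u] for u in adj if v in adj[u])
                 for v in adj}
        norm = sum(auths.values()) or 1.0
        auths = {v: a / norm for v, a in auths.items()}
        # hub: sum of authority scores of nodes v points to
        hubs = {u: sum(auths[v] for v in adj[u]) for u in adj}
        norm = sum(hubs.values()) or 1.0
        hubs = {u: h / norm for u, h in hubs.items()}
    return hubs, auths

# demo: nodes 1 and 2 both point to node 3
g = {1: {3}, 2: {3}, 3: set()}
```

In the demo, node 3 becomes the sole authority, while nodes 1 and 2 share the hub score equally.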

Finally, the LeaderRank (LR) algorithm represents an improvement over PR, since the probability parameter is adaptive, leading to a parameter-free algorithm directly applicable to any type of complex network [37]. The method is applied by adding an additional ground node g that is connected to all other nodes, ensuring the graph is connected. A random walk then adds a score of +1 to each visited node. The ground node starts with a score s_g(0) = 0, and all other nodes in G start with s_i(0) = 1. Using the notation s_i(t) for the score of node i at time t, the evolving score can be expressed as

s_i(t) = Σ_{j ∈ Γ(i)} s_j(t − 1)/k_j.

The score is proven to converge towards a steady state at a time t_c [37]; the score of the ground node is then evenly distributed to all other nodes to conserve the scores on the nodes of interest. The final, stable LR score of node i is expressed as

LR_i = s_i(t_c) + s_g(t_c)/N.
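The whole procedure, including the ground node and the final redistribution, can be sketched in a few lines (our own illustrative code for undirected graphs; a fixed iteration budget stands in for a proper convergence check):

```python
def leader_rank(adj, iters=200):
    # add a ground node g linked bidirectionally to every node,
    # then iterate the random-walk score update
    g = object()  # ground node (any unique hashable)
    full = {v: set(ns) | {g} for v, ns in adj.items()}
    full[g] = set(adj)
    score = {v: 1.0 for v in adj}
    score[g] = 0.0
    for _ in range(iters):
        new = {v: 0.0 for v in full}
        for u, outs in full.items():
            share = score[u] / len(outs)
            for v in outs:
                new[v] += share
        score = new
    # redistribute the ground node's score evenly over the N nodes
    extra = score[g] / len(adj)
    return {v: score[v] + extra for v in adj}
```

On a fully symmetric graph such as a triangle, every node keeps its initial unit score, since the redistribution conserves the total score.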

Data Availability

The real-world social network datasets supporting this research article are from previously reported studies and have been cited individually in the Methods section. The generated (synthetic) datasets are available from the corresponding author upon request.

Conflicts of Interest

The author declares that there are no conflicts of interest regarding the publication of this article.

Acknowledgments

This research was partly supported by the Romanian National Authority for Scientific Research and Innovation (Unitatea Executiva pentru Finantarea Invatamantului Superior, a Cercetarii, Dezvoltarii si Inovarii), Project PN-III-P1-1.1-PD-2016-0193.

Supplementary Materials

Figure 1: changes in correlation of node overlapping, for the 10 analysed ranking methods, as the spreader size is increased from 1% to 10% of the total network size N. Each synthetic network has 10,000 nodes. Figure 2: spatial distribution of selected spreader nodes on the mesh network. The top spreader nodes are highlighted, as determined by the degree, closeness, betweenness, and PageRank centralities, respectively. Table 1: synthetic dataset (i.e., random, mesh, small-world, and scale-free) benchmark results for pair-wise competition between centrality measures. Each cell (i, j) contains the final opinion coverage (0–100%) for centrality i; the symmetric cell (j, i) represents the same number on a colour gradient from blue (0%) through white (50%) to orange (100%). Table 2: real-world dataset benchmark results for pair-wise competition between centrality measures. Each cell (i, j) contains the final opinion coverage (0–100%) for centrality i; the symmetric cell (j, i) represents the same number on a colour gradient from blue (0%) through white (50%) to orange (100%). Figure 3: performance of each ranking method (i.e., coverage 0–100%) on the 8 datasets using individual SIR benchmarking. Figure 4: performance of each ranking method (i.e., coverage 0–100%) on the 8 datasets using simultaneous competition-based benchmarking. Figure 5: comparison between the naïve (a–c) and graph colouring (d–f) methods using three competitive diffusion examples on the mesh network. Larger nodes represent spreader nodes. The first centrality in the figure captions corresponds to the orange opinion and the second centrality to the blue opinion. Figure 6: difference in spreader spacing for closeness (orange) when switching from the naïve method (a) to the graph colouring method (b). Table 3: comparison between the naïve and graph colouring methods in terms of selecting spreader nodes. Performance is expressed as a percentage (%) for each node centrality in three competitive simulation scenarios.