Abstract

Research on the structure and formation mechanisms of social networks has produced several models that differ in the attachment patterns of new links (edges). The driving factor behind the addition of new links is just as important as the attachment pattern, yet it has received very little attention so far. We present an agent-based model that reproduces large-scale social networks. We find that the structure of social networks is a consequence of individuals’ continuous decision-making processes, based on self-evaluation, and of the turnover of the population. Individuals’ self-evaluation is the key motivating factor for the addition of new links, while the attachment patterns and the turnover of the population determine the topology of social networks. The networks produced by our model lie between order and randomness, which is consistent with current observations and research on social networks. We also find that some seemingly plausible properties of empirical data are actually artifacts of bounded sampling. Our research reveals the driving factors behind the evolution of social networks as well as the underlying patterns of that evolution. These findings will lead to a better understanding of social structures.

1. Introduction

Complex networks have attracted growing interest in fields ranging from natural science to social science in the last few years. Advances in data acquisition make it possible to test hypotheses that were once simply assumed to be reasonable. One of these is that networks of complex topology are described by the random graph theory of Erdős and Rényi (ER) [1], whose degree obeys a Gaussian distribution. Barabási and Albert (BA) developed a growing network model, known as the BA model [2], to explain the emergence of scaling in networks; it leads to a power-law degree distribution. However, it is inappropriate to apply this model mechanically to social networks, for two reasons. First, the population is relatively constant, which conflicts with the assumption that the number of nodes grows without end. Second, individuals are not present for the whole lifetime of a social network; they die after a finite lifespan.

Researchers recognized these problems not long after the BA model was proposed [3–7]. Furthermore, analysis of empirical data [8, 9] shows that some social networks exhibit single-scale degree distributions rather than a power-law regime. González et al. [5] and Singer et al. [7] therefore introduced constraints, termed the “aging effect,” to limit the addition of new links, and their results fit these observations well. However, these models still have some problems. (i) The empirical networks used in these models come from an in-school questionnaire among junior high school students in the USA (the acquaintance network comes from a questionnaire among 43 Utah Mormons). Are social networks drawn from such specific groups universal and representative? And could these models explain the formation of large-scale social networks? (ii) The constraints introduced in these studies are passive responses to the problem of unbounded link addition and do not lead to a deeper understanding of what drives the addition of new links. We have developed a new agent-based model for large-scale friendship networks that reproduces accepted properties of social networks, such as the small-world phenomenon and community structure [10, 11]. We also find that the resulting degree distribution displays a clear scale-free regime, indicating that the single-scale properties observed in empirical networks are artifacts of bounded sampling. These findings answer the questions above and confirm the drawbacks of the previous models.

2. Materials and Methods

Previous studies neglected two key features of real social networks. (i) Continuous turnover of the population. In the real world, the aged die at a certain rate while new individuals are born at a matching rate, so the number of individuals present in the network stays almost unchanged over time. (ii) Ongoing individual decision-making. Living individuals carry out social activities with their current neighbors in the network, including family members, schoolmates, colleagues, and sometimes strangers, and when they are dissatisfied with their current circle of friends they try to build new relationships with strangers. Both ingredients are incorporated into our model framework. We also introduce a social activeness index, a comprehensive measure of an individual’s activeness in social networks. An active individual is likely to make more friends. Combined with the individuals’ natural attributes, this yields an agent-based model in which individuals are characterized by a set of natural and social attributes (identity number, age, residence, and social activeness index) together with behavioral rules.

At the start of the simulation, a fixed number of individuals (agents) with no initial connections are placed at random on a grid of cells. This mirrors the fact that a city is usually composed of a number of communities, each shared by many individuals. For simplicity, identity numbers are assigned in increasing order starting from 0, so an older individual has a smaller identity number. Each individual’s age is an integer selected randomly from 1 to max_age, where max_age is set by the death rate of the population. The social activeness index quantifies an individual’s social activeness; it is assigned at birth as a value between 0 and 1 drawn from a specified distribution. Figure 1 illustrates our model framework.

At each time step, every individual evaluates his social situation from his current number of neighbors in the network (his degree). For this evaluation we borrow the utility function [12], widely used in economics to quantify a consumer’s total satisfaction from consuming a good or service; the borrowing is motivated by the striking similarities observed among many collective human activities [13–17]. We employ an exponential utility function of the individual’s degree and social activeness index. The lower the utility, the more dissatisfied the individual is with the current situation and the more likely he is to make a new contact. When dissatisfied, an individual chooses a new friend according to three criteria: age difference, grid distance, and the number of shared friends. These criteria have been observed in many social networks and used in earlier studies [3, 5, 11, 18], but they have never been integrated into a single model. The probability of making a new contact with a given candidate depends on these three quantities, on node degree, and on three scale factors. The effects of age difference and grid distance on the network grow as their respective scale factors increase, whereas the third scale factor acts in the opposite direction on the effect of shared friends.

After every individual has made his decision independently, the network is updated. We assume that friendship is reciprocal: when one individual adds another to his friends list, the other adds him in turn. A small number of individuals are then removed from the network together with their links, and the same number of individuals with no initial links are added, corresponding to death and birth in the real world. This number is determined by the death rate (since we assume that the population is relatively constant, the death rate is also the birth rate).
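One time step of the procedure described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors’ implementation. Only the qualitative ingredients come from the text (an exponential utility of degree and social activeness, attachment driven by age difference, grid distance, and shared friends, reciprocal links, and equal numbers of deaths and births). The functional forms of `utility` and `attach_weight`, the parameter names `scale`, `a`, `b`, `c`, the default values, the uniform draw of the activeness index, and the rule that the oldest individuals die first are all hypothetical choices made for illustration.

```python
import math
import random


class Agent:
    def __init__(self, ident, age, cell, activeness):
        self.ident = ident            # identity number
        self.age = age                # integer age in time steps
        self.cell = cell              # (x, y) position on the grid
        self.activeness = activeness  # social activeness index in [0, 1]
        self.friends = set()          # identity numbers of current friends


def make_population(n, max_age, grid, rng):
    # ASSUMED: activeness is drawn uniformly; the text allows other choices.
    return {k: Agent(k, rng.randint(1, max_age),
                     (rng.randrange(grid), rng.randrange(grid)),
                     rng.random())
            for k in range(n)}


def utility(degree, activeness, scale=10.0):
    # ASSUMED form: satisfaction grows with degree and saturates later
    # for more active individuals; the text only says "exponential".
    return 1.0 - math.exp(-degree / (1.0 + scale * activeness))


def attach_weight(i, j, a=0.5, b=0.5, c=1.0):
    # ASSUMED form: closer ages, closer cells, and more shared friends
    # all raise the weight; a, b, c are hypothetical scale factors.
    d_age = abs(i.age - j.age)
    d_grid = abs(i.cell[0] - j.cell[0]) + abs(i.cell[1] - j.cell[1])
    shared = len(i.friends & j.friends)
    return math.exp(-a * d_age - b * d_grid) * (1.0 + c * shared)


def step(agents, next_id, rng, death_rate=0.02, grid=5):
    """Advance the population (dict: ident -> Agent) by one time step."""
    # 1. Every individual decides independently, based on the old network.
    new_links = []
    for i in agents.values():
        if rng.random() < utility(len(i.friends), i.activeness):
            continue  # satisfied: no new contact this step
        others = [j for j in agents.values()
                  if j.ident != i.ident and j.ident not in i.friends]
        if others:
            weights = [attach_weight(i, j) for j in others]
            j = rng.choices(others, weights=weights, k=1)[0]
            new_links.append((i.ident, j.ident))
    # 2. The network updates; friendship is reciprocal.
    for u, v in new_links:
        agents[u].friends.add(v)
        agents[v].friends.add(u)
    # 3. Turnover: the oldest die (ASSUMED rule) and are replaced by
    #    the same number of newborns with no links.
    for agent in agents.values():
        agent.age += 1
    n_dead = max(1, round(death_rate * len(agents)))
    dead = sorted(agents.values(), key=lambda x: -x.age)[:n_dead]
    for d in dead:
        for f in d.friends:
            agents[f].friends.discard(d.ident)
        del agents[d.ident]
    for _ in range(n_dead):
        cell = (rng.randrange(grid), rng.randrange(grid))
        agents[next_id] = Agent(next_id, 1, cell, rng.random())
        next_id += 1
    return next_id
```

A run would start from `agents = make_population(50, 50, 5, random.Random(0))` and call `step` repeatedly, passing the returned value back in as `next_id`. Collecting all decisions before touching the network mirrors the statement that individuals decide independently and the network updates afterwards.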

3. Results and Discussion

Figure 2(a) shows the degree distributions obtained when the social activeness index is drawn from truncated normal distributions with different parameters (the uniform distribution can be regarded as the limiting form of a truncated normal distribution with a large spread). A normal distribution implies that most individuals have similar social activeness, while only a few are very active or very inactive. As the spread of the distribution increases, social activeness becomes much more diverse. Our simulations suggest that the heavy tail then becomes more pronounced, with a smaller scaling exponent. For the uniform distribution, the scaling exponent is approximately 2.38. Sensitivity analyses of four further model parameters show that they do not significantly change the scaling exponent (see Figure 3).
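The relation between the truncated normal and the uniform distribution can be checked with a short Python sketch. It draws the social activeness index by rejection sampling on the interval from 0 to 1. The function name `sample_activeness`, the parameter names `mu` and `sigma`, and the specific values are assumptions made for illustration; the text does not say how the authors sampled the index.

```python
import random
import statistics


def sample_activeness(mu, sigma, rng):
    # Rejection sampling: redraw until the value falls inside [0, 1].
    while True:
        x = rng.gauss(mu, sigma)
        if 0.0 <= x <= 1.0:
            return x


rng = random.Random(1)
narrow = [sample_activeness(0.5, 0.1, rng) for _ in range(5000)]
wide = [sample_activeness(0.5, 10.0, rng) for _ in range(5000)]

# Spread of the sampled index for a small and a large sigma.
print(statistics.pstdev(narrow), statistics.pstdev(wide))
```

With a small `sigma` most individuals have nearly the same activeness. With a large `sigma` the sample spread approaches that of a uniform distribution on the unit interval, whose standard deviation is 1/sqrt(12).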

Intuitively, social networks should be highly robust: the death of individuals, even of hubs (people with many links), should not greatly shatter a social network. We therefore compare the output of our model with other networks, namely, a random network from ER theory and a scale-free network from the BA model. As Figure 2(b) shows, the robustness of the obtained network lies between those of the BA and ER networks. If we regard the BA network as well ordered and the ER network as random, the obtained networks lie between order and randomness, just as Watts [18] described. To make the contrast between networks clearer, we define a shatter index in terms of the number of nodes removed and the number of nodes in the largest connected subnetwork that remains after those nodes have been removed. For a fully connected network, the geometric meaning of the index is the area enclosed by the curves in Figure 2(b) and the coordinate axes.
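The robustness comparison can be sketched in Python as follows. The code records the size of the largest connected subnetwork as nodes are removed one by one and then measures the area under that curve. It is a sketch under assumptions: removing the current highest-degree node at each step and normalizing both axes by the initial number of nodes are choices made here for illustration, and the function names are hypothetical. The exact definition of the shatter index is not reproduced.

```python
from collections import deque


def largest_component(adj):
    """Size of the largest connected component of adj (dict: node -> set)."""
    seen, best = set(), 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue, size = deque([start]), 0
        while queue:
            u = queue.popleft()
            size += 1
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        best = max(best, size)
    return best


def shatter_curve(adj):
    """Largest-component size after removing 0, 1, ..., N nodes in turn."""
    adj = {u: set(vs) for u, vs in adj.items()}  # work on a copy
    curve = [largest_component(adj)]
    while adj:
        # ASSUMED removal order: current highest-degree node first.
        hub = max(adj, key=lambda u: len(adj[u]))
        for v in adj.pop(hub):
            adj[v].discard(hub)
        curve.append(largest_component(adj))
    return curve


def area_under_curve(curve):
    # ASSUMED normalization: both axes are divided by the initial size N.
    n = len(curve) - 1
    return sum(curve) / (n * n) if n else 0.0


# A star falls apart once its hub is removed; a complete graph does not.
star = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
complete = {u: {v for v in range(5) if v != u} for u in range(5)}
print(area_under_curve(shatter_curve(star)),
      area_under_curve(shatter_curve(complete)))
```

A smaller area means the network shatters sooner, so the star scores lower than the complete graph of the same size.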

We also examine the grid distance and age difference between friends in the networks. As Figures 4(a), 4(b), and 4(c) show, the regional structure becomes more conspicuous as the scale factor for grid distance increases. Figure 4(d) shows the age difference between connected node pairs in networks obtained with different values of the scale factor for age difference. A comparison of the curves reveals a significant positive correlation between the age effect and this scale factor.

The shortage of empirical large-scale social networks makes it impossible to verify the obtained networks by direct comparison. However, research in related areas provides strong circumstantial evidence. In the past few years, the tight correlation between the spreading of infectious diseases and social networks has attracted widespread attention. Eubank et al. [19] proposed a large-scale simulation framework based on realistic urban social networks. Although edges in their networks are defined as contacts between individuals, the networks also carry valuable information about friendship networks because contact networks and friendship networks interact: close friends tend to make more contacts, while frequent contacts lead to closer relationships. Their studies show that real networks are strongly connected small-world graphs with a well-defined scale for the degree distribution and cannot be shattered by removing a small number of high-degree vertices, which agrees with the network obtained from our model.

Next, by sampling individuals from the obtained social networks and reconstructing their interconnections, we show that our model reproduces the single-scale properties observed in in-school friendship networks. Because in-school friendship networks show a strong age effect, we use a large scale factor for age difference, so that an individual making a new friend clearly prefers individuals of the same generation. After the network has evolved for a few time steps, we pick a group of individuals of similar age together with their interconnections. This gives a subnetwork corresponding to an in-school friendship network, as shown in Figure 5(c). It is visually apparent that this network is composed of communities of different sizes. Figure 5(b) shows the connectivity distribution of the in-school friendship network, which is strikingly similar to the observations [5, 7, 20]. The degree distribution of the full social network, however, exhibits a visibly heavy tail (Figure 5(a)). This indicates that the single-scale properties observed in empirical in-school friendship and acquaintance networks are artifacts of sampling rather than universal phenomena, so that the earlier models are unable to describe social systems.
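The sampling step can be illustrated with a short Python sketch that extracts the subnetwork induced by individuals in a given age window. The function name `cohort_subnetwork`, the adjacency-dictionary representation, and the toy data are assumptions made for illustration, not the authors’ code.

```python
def cohort_subnetwork(adj, age, lo, hi):
    """Induced subgraph on the nodes whose age lies in [lo, hi]."""
    keep = {u for u in adj if lo <= age[u] <= hi}
    # Keep only the links whose two endpoints are both in the sample.
    return {u: adj[u] & keep for u in keep}


# Toy network: node 0 is a hub with friends of very different ages.
adj = {0: {1, 2, 3, 4}, 1: {0, 2}, 2: {0, 1}, 3: {0}, 4: {0}}
age = {0: 14, 1: 14, 2: 15, 3: 40, 4: 62}

sub = cohort_subnetwork(adj, age, 13, 15)
print(len(adj[0]), len(sub[0]))  # degree of node 0 before and after sampling
```

Links to individuals outside the sampled cohort are discarded, so a hub’s degree inside the sample can be much smaller than its degree in the full network. This is the mechanism by which bounded sampling can hide a heavy tail.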

4. Conclusions

In this study, we have uncovered the structure and formation of social networks by capturing some important features of the real world. This has significant potential for interpreting many social phenomena related to human activities. Further study will focus on the effects of social networks on individual migration, which may help explain the heterogeneity of human geographical distribution.

Conflict of Interest

The authors declare that they have no conflict of interest.

Acknowledgments

This study was funded by the Ministry of Science and Technology of China Grant 2013ZX10004605 and the National Natural Science Foundation of China (NSFC) Grant 90924019.