Abstract

Web-based Social Networks (W-bSNs) have recently experienced a significant rise in both the number of users and the number of relations among them. Twitter is a W-bSN in which the relevance of the posted commentaries influences how users create new relations. The reputation of a user has a direct effect on the perception and opinions of other people and can be used to obtain advantages. A thought expressed by an influential user can cause other users to change their minds about a topic. In this work, we present the design and results of an empirical study that analyzes the cross influence among users, their interests, and the messages they post, and how relevant these messages are to the creation of new relations. One of the main contributions of this approach is the analysis of user behavior and of the impact of diversifying topics and including additional resources in the tweet, such as videos, images, or URLs. Finally, the experimental results suggest that the proposed strategies are effective for the accounts analyzed.

1. Introduction

Web-based Social Networks (W-bSNs) have evolved enormously since their dawn in the late 90s. Many, such as MySpace® or Hi5®, had relative success but eventually faded. However, they established the basis for developing a new generation of W-bSNs. Recently, there has been renewed interest in two of the most used W-bSNs worldwide: Twitter® and Facebook®. Each of them has a considerable number of users, and their activity represents many terabytes of information exchanged every day [1].

Twitter is considered a W-bSN; however, due to its characteristics, it is also classified as a microblogging service [2]. Microblogs are short textual messages that users post to talk about their regular activities and to seek or share information [3].

Twitter implements an asymmetrical operation model: no agreement or authorization is needed for a user to create a new relation with another user. In Twitter, this relation is called “follow.” A user willing to receive updates of the commentaries posted by another user, called tweets, only needs to “follow” her, with no action required by the latter. Thus, the users’ principal activity on Twitter is posting and sharing tweets, which are short messages of at most 140 characters expressing thoughts, ideas, feelings, or opinions [4]. Another action that can be performed in Twitter is the retweet (RT), which consists of spreading a message that was previously tweeted or retweeted by another user. It is important to note that a user does not have to follow another user to be able to RT one of her tweets. However, most often, users retweet the tweets posted by the accounts that they follow. User behavior comprises the various social activities that users can perform online, such as content publishing, retweeting, profile browsing, and commenting. These actions help us to understand the users’ interests in different activities. In our case, the primary concern is to understand connectivity and interaction through the RT [5].

Hence, this model produces some tweets which, given their relevance for other users, can be widely spread, eventually becoming viral and, in Twitter terminology, a Trending Topic (TT). Thus, a TT is a topic that many people are interested in and, in consequence, decide to spread. Given these characteristics, Twitter has emerged as a suitable platform through which people can try to become popular, that is, to gain more followers every day or in short periods. With this ultimate intention, a given user will try to post tweets on interesting or controversial subjects, so that other people become interested in those tweets and eventually become his followers [6]. There are also passive users, who only want to keep informed about the tweets of the people they follow and do not have an active presence on Twitter. The active users become a subject of study to determine the factors that can eventually allow a user to get as many followers as possible. A user with a substantial number of followers can be considered influential and has the capacity to cause an effect in different ways: each idea he expresses will automatically reach a large number of users of the social network.

In this paper, we present results of our work, which consisted of analyzing the factors that may lead to a user getting more followers and, in this way, becoming popular.

This paper is organized as follows. In Section 2 we describe the work related to our research and what other authors have proposed. In Section 3 we present the support for our experimentation, in Section 4 the elements necessary to initiate our experimentation, and in Section 5 the results of our experiment. In Section 6 we draw some conclusions and propose some reasonable directions for future research. Finally, we present the consulted references.

2. Related Work

Since its creation in 2006, Twitter has gained notoriety and popularity. Currently, it has about 313 million monthly active users. It has become a near real-time means of spreading information that is widely used worldwide, and its relevance has grown enormously. As explained before, some users tend to have more followers because other users seek their opinion; in other words, they build a reputation. These users are considered influential. The relevant element to analyze here is which factors lead some users to gain more followers, and how this influences the number of followers of other users. Several works study influential or valuable users [7–9], the impact of tweeting and retweeting [10–13], viral marketing [14–16], and other aspects, which are used to understand the spread of information and the level of user influence.

Cha et al. [17] present an empirical study of the patterns of influence of users considered popular. They consider three key features: in-degree, understood as the number of users following the user under analysis (in other words, the popularity of that user); the number of RTs; and the number of mentions. The authors indicate that the follower count is not the only attribute for measuring influence; RTs and mentions must also be considered. They propose the existence of influential users, that is, those whose tweets are widely retweeted and who receive a large number of mentions. From their analysis, they concluded that such users tend to publish tweets on controversial subjects. Their study also revealed that users who limit their tweets to a single topic show a greater increase in their influence rating.

Romero et al. [18] formalize the intuitive idea that some users are harder to influence because they are not interested in creating or sharing information, and argue that the majority of Twitter users are passive. According to the authors, passivity is a barrier to propagation: while some users retweet a lot, others do so only rarely. The authors make the following assumptions: (a) a user’s influence score depends on the number of people she influences and on their passivity; (b) influence can be inferred from several kinds of evidence, such as retweets and other co-occurrences of content in general. They propose an algorithm similar to Hyperlink-Induced Topic Search (HITS) and PageRank to measure influence, considering not only the number of followers but also the RTs and mentions. They found that influential users are highly active and therefore defined a new influence measure based on user activity.

Retweeting, as stated in [19], is of prime importance in Twitter, since performing this action indicates not only interest in a given tweet, but also trust in the original publisher and agreement with the content. This is an important conclusion that supports our work later.

Another work, presented in [20], affirms that information tends to be propagated by users who have been influential in the past and who also have a substantial number of followers. The authors propose a formula that measures the influence of users, taking into consideration the number of RTs and the number of mentions they have.

Other researchers have focused on offering methods to measure the influence of users in subjects with similar topics. Anagnostopoulos et al. [21] define influence as the fact that one person can induce another person to act similarly. Users who act are called “active.” They present a probabilistic model that evaluates when a user becomes active in a period, and they assume that having active friends (this is how the authors refer to followers) increases a user’s probability of becoming active too. They model people as influencing each other at every discrete time step and fit the model by maximum likelihood estimation.

In [22] Crandall et al. study the influence of users based on homophily, a term that refers to the level of similarity of people who interact [23]. They divide it into social influence and selection. Social influence occurs when people pick up behaviors from the people they interact with, and selection occurs when they seek out similar users to interact with. They quantify the similarity of users over time considering the topic of interest of each user. The authors propose a model of user behavior in which individual users can interact with others; the users with the highest numbers of activities and interactions are referred to as influential users.

The work of Weng et al. [24] consists of identifying influential users on Twitter. The strategy is similar to the PageRank algorithm, and the proposed algorithm considers topics extracted from tweets. One of the main contributions is that they compute each user’s topic distribution based on their tweets using Latent Dirichlet Allocation (LDA), showing that the topics of connected users are significantly correlated.

Leavitt et al. [25] present a methodology, based on the analysis of 12 popular accounts, for determining measures of influence on Twitter. They examined an ecosystem of 134,654 tweets, 15,866,629 followers, and 899,773 followees over a period of 10 days. The authors define influence as the capacity of a user to persuade another user to start following, retweet, and/or comment.

In [5] Jin et al. measure a user’s popularity by the number of followers, the retweet count, and PageRank. Unsurprisingly, celebrities such as actors, athletes, and musicians, as well as news accounts, are at the top of the list. The authors conclude that the number of followers alone is not sufficient to reflect the influence of users. The retweet count and the PageRank metric also placed celebrity and news accounts at the top. The authors mention that the metrics taken together capture the popularity of the user.
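
To make the difference between follower count and PageRank concrete, the following minimal sketch runs a plain PageRank power iteration over a toy follow graph. The graph, the account names, the damping factor of 0.85, and the number of iterations are illustrative assumptions; this is not the implementation used in [5], [18], or [24].

```python
def pagerank(follows, damping=0.85, iterations=50):
    """Plain PageRank by power iteration.

    `follows` maps each account to the list of accounts it follows,
    so rank flows from a follower to the accounts it follows.
    """
    nodes = set(follows)
    for targets in follows.values():
        nodes.update(targets)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = follows.get(node, [])
            if targets:
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling account (follows nobody): spread its rank evenly.
                for other in nodes:
                    new_rank[other] += damping * rank[node] / n
        rank = new_rank
    return rank


# Hypothetical toy graph: "a" has three followers, while "b" has a single
# follower, which happens to be the well-followed account "a".
toy_follows = {"a": ["b"], "c": ["a"], "d": ["a"], "e": ["a"]}
toy_rank = pagerank(toy_follows)

# Follower count of every account in the same toy graph.
follower_count = {node: 0 for node in toy_rank}
for targets in toy_follows.values():
    for target in targets:
        follower_count[target] += 1
```

In this toy graph, "a" has the most followers, but "b" ends up with the higher PageRank because its only follower is itself highly ranked, which illustrates why the two metrics can order accounts differently.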

Compared to the previously presented works, ours can be considered an experimental framework that allows us to analyze, using real data, the actual impact that tweets, RTs, and mentions may have on the popularity of Twitter users. We assume as our hypothesis that RTs and mentions made by influential users have an effect on the number of followers of a given user.

3. Basic Experiment Framework

Influence is the ability of an individual or a group of individuals to modify the perception or beliefs that other people have about a given subject. The reputation of a user has a direct effect on the perception and opinions of other individuals and can be used to obtain advantages. An idea expressed by an influential user can cause other users to change their mind about what they thought before. There have been studies in fields such as sociology, politics, and marketing about the influence that experts or recognized people in specific areas may have, in order to understand why certain trends appear. For example, a campaign turns out to be more effective if a message related to it becomes viral. Traditional communication theory [26, 27] affirms that a minority of people in a group, regarded as distinguished and influential, become natural leaders with the capacity to persuade others. Brown et al. [28] present the results of a two-stage study aimed at investigating traditional communication in social networks; their results suggest that traditional models may be less suitable for these case studies.

In W-bSNs, particularly Twitter, we can also find users who stand out in some areas and have a number of followers who maintain a more or less permanent interest in their ideas or opinions. These followers become participatory entities, mentioning a user they are interested in and sharing his ideas or opinions with their own followers.

Our main interest consists in analyzing the influence patterns among Twitter users and how users considered experts in a given field can, through RTs, promote the growth of the number of followers of another user.

Following the proposal of [29], we assumed that there are 18 thematic categories of main subjects of interest (art and design, books, business, charity and deals, fashion, food and drinks, health, holidays and dates, humor, music, politics, religion, science, sports, technology, tv and movies, other news, and other). We then defined six linguistic values for the number of followers a given user has, as shown in Table 1. We also use the algorithm proposed in [29] to decide which category each tweet belongs to.

As can be seen from Table 1, the “Unknown” linguistic value represents either users who have just created their accounts or users with little activity whose publications attract almost no other users, having fewer than 1,000 followers. The “Ordinary” users are those who start gaining some popularity and, in consequence, start gaining new followers who are interested in their timeline and RT them. For the “Outstanding” users, we have defined three different levels based on their number of followers; they represent active accounts that post several times a day, attract ever more followers and, in consequence, obtain more RTs than the “Ordinary” and “Unknown” users. From our results, we could infer that the opinions of these users are respected, so their influence can be regarded as important. Finally, “Famous” users are those who have a very large number of followers. Some users of this type are @katyperry (95,462,792), @justinbieber (91,380,536), @BarackObama (83,150,841), and @youtube (66,255,785), to name a few.
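
The mapping from a follower count to a linguistic value can be sketched as a simple threshold lookup. Only the 1,000-follower bound for “Unknown” is stated in the text; every other bound below is a hypothetical placeholder and should be replaced with the values of Table 1. The function name `linguistic_value` is likewise an assumption made for illustration.

```python
from bisect import bisect_right

# Lower bounds (in followers) at which each successive label starts.
# Only the first bound (1,000) comes from the text; the remaining four
# are hypothetical placeholders, NOT the actual values of Table 1.
LOWER_BOUNDS = [1_000, 10_000, 100_000, 1_000_000, 10_000_000]
LABELS = [
    "Unknown",
    "Ordinary",
    "Outstanding 1",
    "Outstanding 2",
    "Outstanding 3",
    "Famous",
]


def linguistic_value(followers):
    """Return the linguistic value for a given follower count."""
    # bisect_right counts how many lower bounds the account has reached.
    return LABELS[bisect_right(LOWER_BOUNDS, followers)]
```

For example, `linguistic_value(999)` returns "Unknown", and under the placeholder bounds an account with 95,462,792 followers is labeled "Famous".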

Currently, Twitter has approximately 313 million active users; the share of each linguistic value is not known for certain, although accounts classified as “Famous” are estimated to be a small minority. We therefore obtained a sample of one million random users to estimate the proportions of the accounts: 98% correspond to the “Unknown” linguistic value, 1.53% to “Ordinary,” 0.35% to “Outstanding 1,” 0.072% to “Outstanding 2,” 0.040% to “Outstanding 3,” and just 0.008% to “Famous.” We consider this sample representative of the distribution of accounts on Twitter.
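
As a quick consistency check, the reported shares can be converted into account counts for the one-million-user sample. The sketch below assumes the reported percentages are exact rather than rounded; the variable and function names are illustrative only.

```python
from decimal import Decimal

SAMPLE_SIZE = 1_000_000

# Shares (in percent) as reported in the text, kept as exact decimals
# to avoid floating-point rounding when summing.
SHARES_PERCENT = {
    "Unknown": Decimal("98"),
    "Ordinary": Decimal("1.53"),
    "Outstanding 1": Decimal("0.35"),
    "Outstanding 2": Decimal("0.072"),
    "Outstanding 3": Decimal("0.040"),
    "Famous": Decimal("0.008"),
}


def expected_counts(shares_percent, sample_size):
    """Convert percentage shares into account counts for the sample."""
    return {
        label: int(share * sample_size / 100)
        for label, share in shares_percent.items()
    }


total_percent = sum(SHARES_PERCENT.values())
counts = expected_counts(SHARES_PERCENT, SAMPLE_SIZE)
```

Under this assumption the shares add up to exactly 100%, and the “Famous” share corresponds to 80 accounts out of the million sampled.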

4. Experiment Design

In this section, we present the design of the empirical study to analyze the influence patterns among Twitter users and how they can promote, through RTs, an increase in the number of followers of other users.

As a first step, we picked a user to be the subject of our analysis, whom we will call the Root user in the following. Using the Twitter API we extracted all the information about the activity of Root, that is, tweets, RTs, mentions, and new followers. At the beginning of the observation, the Root user had a total of 3,253 followers, and his tweets were classified mainly in the technology category. Since the creation of this user’s Twitter account, the growth of his followers has been relatively stable, averaging two to three new followers per week. These new followers also corresponded to the technology category.
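
The weekly growth baseline can be computed from the dates on which new followers appeared. The sketch below is a minimal illustration that works on data already extracted; it makes no call to the Twitter API, and the dates in `sample_follow_dates` are hypothetical, chosen only to show the calculation.

```python
from collections import Counter
from datetime import date


def new_followers_per_week(follow_dates):
    """Count follow events per ISO (year, week)."""
    weeks = Counter()
    for day in follow_dates:
        iso_year, iso_week, _ = day.isocalendar()
        weeks[(iso_year, iso_week)] += 1
    return dict(weeks)


def average_weekly_growth(follow_dates, observed_weeks):
    """Average number of new followers per observed week."""
    return len(follow_dates) / observed_weeks


# Hypothetical follow dates spanning two ISO weeks (not the experiment's data).
sample_follow_dates = [
    date(2017, 1, 2),
    date(2017, 1, 4),
    date(2017, 1, 10),
    date(2017, 1, 12),
    date(2017, 1, 13),
]
per_week = new_followers_per_week(sample_follow_dates)
average = average_weekly_growth(sample_follow_dates, observed_weeks=2)
```

Passing the number of observed weeks explicitly, instead of counting the weeks present in the data, keeps weeks with no new followers in the average.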

Then we decided to modify his behavior by diversifying his primary publishing interests and adding resources to the tweets, such as images, URLs, or videos. The new categories in which this user began to participate included sports and music. We must explicitly mention that, although this user published mainly about technology, the tweets he had previously posted in other areas contained no relevant information, so other users ignored them. In this context, tweets having these other categories as the main subject were analyzed by the algorithm of [29]. Once we were sure that they corresponded to the desired category, we included a hashtag (HT) that was commonly used at the moment of the experiment.

This was done so that his tweets would be included in the existing conversation threads and users discussing these subjects could see his posts.

Our proposed strategies for follower growth through retweets are as follows:

(1) Changing the normal behavior of the account: the published topics were diversified to keep other users interested in the timeline and to encourage them to share the publications they considered attractive

(2) Including additional content in the tweet, such as images, URLs, and videos: we noticed that there was more interest in content other than plain text, which supports growth through retweets

(3) Using a HT of appropriate size to accompany the content: in this way we can reach the users interacting on the topic determined by the HT used

Below are the results of the strategies mentioned above and what effect they had on account growth.

5. Experiment Results

Based on the previous analysis, we first illustrate (Figure 1) the regular behavior of a conventional user during seven weeks, followed by a slight change in behavior at the beginning of the experiment that lasted four weeks; after that, the behavior was permanently modified by increasing the number of tweets and diversifying the subjects that the user addresses. As the actions change, the number of new followers per unit of time (week) grows. During the initial period, an average growth of two to four new followers (dots) per week (triangles) can be observed. Figure 1 presents an extract from the graph that shows the general behavior explained above; Figure 2 depicts the behavior from the start to the end of the experiment.

In Figure 2, it is clear that, from the moment the experiment started until we finalized it, a period of fourteen weeks, the behavior regarding the growth of the number of followers changed. The diversification of topics was extended mainly to the sports, politics, business, and other categories.

In Figure 2, the number of new followers increased when Root published a tweet in the categories previously mentioned. In the figure, P corresponds to the politics category, T to technology, B to business, and O to others. A vertical line headed by the letter of the chosen category marks each time our Root user posted a new tweet in a specific category, and the graph below it shows the number of new followers associated with each tweet. The background shows the number of new users at a particular moment of the experiment, independently of the tweet they are related to. It can be observed that tweets in the politics category obtained a substantial increase in the number of followers, because users with the “Outstanding” linguistic value retweeted the original tweet and their followers considered it relevant.

The horizontal line in Figure 2 marks the fragment corresponding to Figure 3, where we can observe the correlation between the RT action and the follow action: some of the users reached through a RT considered the commentaries of the Root user interesting enough to start following him. As in Figure 2, P corresponds to the politics category, T to technology, B to business, and O to others.

In this context, we can see that each time the Root user’s tweets were retweeted (in gray), especially by users with large numbers of followers, the number of new followers increased.

As a collateral result of our experiment, note that not every user who retweeted a given Root user tweet decided to follow him. The RT is essentially an easy way for users to endorse a post they have read and liked by republishing it for their followers to see. The authors of [30] describe it as a vote of confidence in the message, and the number of retweets intuitively indicates the quality of the publication. A tweet with a greater number of retweets turns out to be more attractive to users.

These users were reached through a RT by a user they follow and considered the original opinion attractive enough to spread it, but at first they did not consider the Root user interesting enough to start following him. Of all the users who retweeted, 37.5% decided to start following Root; it is important to mention that some users started following Root and later decided to end the relation.
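
The share of retweeting users who went on to follow Root is a simple conversion rate. The sketch below shows one way to compute it from two sets of user identifiers; the identifiers are hypothetical and were chosen so that three of eight retweeters follow, which reproduces the reported 37.5%.

```python
def follow_conversion(retweeters, new_followers):
    """Percentage of retweeting users who also became followers."""
    retweeters = set(retweeters)
    if not retweeters:
        return 0.0
    converted = retweeters & set(new_followers)
    return 100.0 * len(converted) / len(retweeters)


# Hypothetical identifiers: eight users retweeted, three of them followed.
# "x1" followed without retweeting, so it does not count as a conversion.
sample_retweeters = ["u1", "u2", "u3", "u4", "u5", "u6", "u7", "u8"]
sample_new_followers = ["u2", "u5", "u7", "x1"]
rate = follow_conversion(sample_retweeters, sample_new_followers)
```

Using sets means a user who retweeted several times is counted once, which matches a per-user rather than a per-retweet rate.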

In Figure 4 we only present those users that followed Root after a RT of one of Root’s publications. Different symbols represent the linguistic values of the types of users, and the star at the center of the graph represents the Root user. Here it can be observed that, for example, an Outstanding 1 (triangle) user produced more new connections (followers) with Root than those generated by an Unknown (circle) user. During the experimentation corresponding to the diversification of subjects we discovered some preferences of the users who interact with the tweets, and for this reason we decided to examine possible strategies for growth. In this regard, we observed that there are two possible avenues through which we can seek a greater number of RTs: the inclusion of additional resources within a tweet and the determination of the optimal size of the HT. In this way, tweets can be more interesting for other users and can spread through RTs or mentions, increasing the number of followers who consider the publications relevant.

The first of these strategies was the incorporation of additional content into the tweet, in the form of an image, a video, or a URL. In Figure 5 we can observe that the type of resource included in the tweet had an impact on the number of RTs obtained. In order of relevance, the highest impact is obtained when a URL is attached, then when an image is attached, and finally when the tweet includes no resource. It is notable that when a video is included the impact is minimal compared to the other alternatives. We agree with the assertion of Zarrella [31] that the inclusion of images or URLs attracts more users to a tweet. Nevertheless, according to our results, it cannot be confirmed that the inclusion of a video has a real impact on the popularity of the tweet.
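
A comparison of this kind can be obtained by grouping tweets by the attached resource and averaging their RT counts. The following sketch illustrates the aggregation only; the `(resource, retweets)` pairs are hypothetical values chosen to mirror the ordering described above, not the data behind Figure 5.

```python
from collections import defaultdict
from statistics import mean


def mean_rts_by_resource(tweets):
    """Average RT count for each resource type.

    `tweets` is an iterable of (resource, retweet_count) pairs.
    """
    groups = defaultdict(list)
    for resource, retweets in tweets:
        groups[resource].append(retweets)
    return {resource: mean(values) for resource, values in groups.items()}


def rank_resources(tweets):
    """Resource types ordered from highest to lowest average RT count."""
    averages = mean_rts_by_resource(tweets)
    return sorted(averages, key=averages.get, reverse=True)


# Hypothetical observations (not the experiment's data).
sample_tweets = [
    ("url", 9), ("url", 7),
    ("image", 6), ("image", 4),
    ("none", 3), ("none", 1),
    ("video", 1), ("video", 1),
]
ranking = rank_resources(sample_tweets)
```

Averaging per resource type, rather than summing, avoids favoring whichever resource type happened to be tweeted most often.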

The second strategy was to reach a greater number of users through the inclusion of HTs. As a result, users outside the Root user’s network could discover him through an explicit search for the HTs used. We explored the inclusion of HTs of one to four words.

In Figure 6, the HT with the greatest acceptance was a single word, and the one with the fewest RTs was a HT with four concatenated words. In this sense, we disagree with Weng et al. [24], who mention that including HTs does not matter for a RT; our results indicate that a HT of proper size is more interesting to the user. From these preliminary results, it is possible to deduce that simplicity is a key element.
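
The size of a HT, measured in concatenated words, can be estimated automatically when the words are written in CamelCase. This is an assumption made for the sketch below: the text does not state how the HT size was determined, and an all-lowercase HT such as "worldcup" would be counted as a single word by this approach. The sample tweet is hypothetical.

```python
import re

# A hashtag is "#" followed by word characters.
HASHTAG_RE = re.compile(r"#(\w+)")
# A word is an acronym (run of capitals not followed by lowercase),
# an optionally capitalized lowercase run, or a run of digits.
WORD_RE = re.compile(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+")


def hashtags(tweet):
    """Return the hashtags in a tweet, without the leading '#'."""
    return HASHTAG_RE.findall(tweet)


def hashtag_word_count(tag):
    """Estimate the number of concatenated words in a CamelCase hashtag."""
    return len(WORD_RE.findall(tag))


# Hypothetical tweet text used only to illustrate the two helpers.
sample_tweet = "Great match tonight #WorldCup #sports"
sizes = {tag: hashtag_word_count(tag) for tag in hashtags(sample_tweet)}
```

With such a helper, tweets can be bucketed by HT size (one to four words) before their RT counts are compared.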

In the previous graphs, we show the growth in the number of RTs when a HT with one or more concatenated words was included in the tweet; the use of HTs gave our tweets a wider propagation. Therefore, we agree with and confirm what Page [32] and Chang [33] mentioned: the HT is a key element for achieving visibility, becoming a search term for those users interested in tweets about the subject. The results of their analyses also show that HTs are used to reach more people by serving as search terms. Our growth was due to the diffusion of tweets whose content helped us reach those users of interest with the “Outstanding” linguistic value, who considered the tweets relevant enough to share with their followers. The strategies mentioned earlier were applied to the Root user, and to confirm the suggested approaches we extended our analysis to twelve random users (a1 to a12), two for each linguistic value. The accounts whose behavior is considered normal mainly publish tweets in the same categories and rarely include additional content; the growth in the number of followers of these accounts stays low. When the behavior changes permanently and HTs are added to the content, the accounts show a noticeably higher growth in the number of followers.

Table 2 shows the results of the extended analysis over one month. Users with the “Famous” linguistic value usually show steady growth due to the number of followers they already have. In contrast, users with normal behavior and no additional content in their tweets did not grow significantly, and in two cases (in italic font) they lost followers. In particular, users a3 and a5 began to include more additional content in their tweets, mainly images and URLs.

The main aim of our experiments was to explore different approaches for understanding the behavior of follower growth and the interest in the content.

6. Conclusions and Future Work

Considering that Twitter is a social network with short messages of at most 140 characters, users need to be concise. For this reason, tweets must be interesting enough for other users. Although it is known that the best way to reach more users is through the diffusion of tweets, by mention or RT, it is not so evident how to achieve this; for this reason, our proposal presents some strategies for publishing more interesting tweets that engage users classified as “Outstanding” or “Famous.” Our results show that the diversification of topics and the inclusion of HTs of a suitable size reach a greater number of users outside the network of followers, making it possible to reach those users considered influential. The goal of our investigation was to demonstrate growth over time by means of tweets that follow content-inclusion strategies to reach users with the linguistic values “Outstanding” and “Famous,” who, by sharing the tweets with their own followers, add interest value to the publications. The proposed strategies can be used on any type of account, whether personal, brand, humor, and so on. Our investigation offers guidelines for further study of user influence and growth criteria.

As future work, we plan to focus on analyzing the particular behavior of “Outstanding” and “Famous” users and thus on defining measures of prestige and validity for the content that they publish.

Conflicts of Interest

The authors declare that they have no conflicts of interest regarding the publication of this paper.