#### Abstract

*Purpose*. The purpose of this article is to predict topic popularity on social networks accurately. An indicator selection model for a new definition of topic popularity based on the degree of grey incidence (DGI) is built on an improved analytic hierarchy process (AHP). *Design/Methodology/Approach*. The importance of indicators is screened with deep learning methods, namely recurrent neural networks (RNNs), long short-term memory (LSTM), and gated recurrent units (GRUs), and a selection model of topic popularity indicators based on AHP is set up. *Findings*. The results show that when topic popularity is defined quantitatively with the DGI method and the weights of topic indicators are obtained with AHP, the average accuracy of topic popularity prediction reaches 97.66%. Training is faster and prediction accuracy is higher than for models built without indicator screening. *Practical Implications*. The method proposed in the paper can be used to calculate the popularity of each hot topic and generate a ranking list of topics by popularity. Moreover, the future popularity of a topic can be predicted by deep learning methods. At the same time, a new application field of deep learning technology is identified and verified. *Originality/Value*. This work lays a theoretical foundation for formulating prevention measures for topic popularity tendencies on social networks and provides an evaluation method that is consistent with the actual situation.

#### 1. Introduction

On a social network platform, user behaviours such as posting, reposting, or commenting on a post may relate to one or more topics. The degree of attention that a topic receives from users is called topic popularity.

Recent research has pointed out that the time sequence of topic popularity in a social network can be used to predict the development of topic tendency [1, 2]. Existing topic popularity predictions are mainly based on records of a single indicator, saved as historical data and fed to linear regression and similar methods. An indicator of users’ behaviours related to topic popularity that changes over time can be regarded as a time series. Research has shown that records of such indicators, for example the numbers of reposts, comments, and likes, are beneficial for prediction [3, 4]. Moreover, topic popularity has been found to be affected by a combination of indicators [5, 6]. In recent years, some scholars have proposed definitions of topic popularity based on comprehensive multiple indicators, but the influence of the indicator weights on popularity has not been studied [7], and algorithms for topic popularity prediction are still mainly based on historical series data analysed with linear regression. The rapid increase in data volume has long caused inaccurate prediction results and poor system stability in traditional prediction systems [8]. Therefore, it is necessary to analyse historical topic popularity together with many other indicators, especially time series indicators, and to use nonlinear analysis methods to extract the key indicators that affect topic popularity so as to predict its future development.

To make an appropriate and effective assessment, an AHP hierarchical structure is applied to indicator selection for topic popularity prediction. In this paper, an indicator selection model based on AHP is therefore proposed for the definition of topic popularity. This new definition is then used in the topic popularity prediction process, which employs deep learning algorithms for popularity series forecasting. In this way, the forecasting accuracy of three kinds of deep learning models, which are popular and good at time series processing, is used to screen and weight the indicators.

The rest of this paper is organized as follows. Section 2 reviews indicator selection for topic popularity, introduces the basic procedure of AHP, and describes the deep learning prediction models most commonly used in time series problems, with which the AHP procedure is improved. Section 3 defines topic popularity with the concept of DGI and explains the AHP-based popularity indicator extraction model proposed in this paper. Section 4 applies three of the most popular deep learning algorithms to topic popularity prediction and reports experiments and analysis. Section 5 presents the conclusions and points out future research directions.

#### 2. Preliminaries

Indicator analysis of topic popularity is used to evaluate the importance of the indicators that affect topic popularity.

##### 2.1. Topic Popularity Indicator Selection

The core goal of big data mining on social networks is prediction. Establishing an indicator selection system is the basis for topic popularity prediction: predictions are made from monitored data and revised dynamically as new data are added. At present, the indicators that measure the popularity of social network topics are those whose values intuitively rank topic popularity through users’ behaviours, which objectively reflects the tendency of social network topic popularity. Current research on topic popularity prediction is based on single or multiple indicators of topics.

Wu and Huberman [9] defined the popularity of each post on the “Digg.com” website by counting the number of votes through which readers expressed their attitudes towards it. Their research found that the popularity followed a log-normal distribution. In addition, Wu et al. [10] further studied the content decay law from the perspective of the number of users’ comments and found that both the time interval between two consecutive comments on the same content and the frequency of comments on a topic obeyed a power law distribution. With the advancement of data mining technology on social networks, Ratkiewicz et al. [11] studied the popularity of Wikipedia, counted the link penetration of nodes, and found that it changed with time and more closely obeyed a power law distribution.

Lerman and Hogg [12] proposed to use random state transitions to represent users’ registration, readings, likes, and the behaviours of topic publishers or their friends on topics in a social network. They took the number of votes that a story accumulates on Digg as its popularity and modelled the popularity of a post on Digg as the independent variable and the number of likes received by the post at time *t* as the dependent variable. The rate equation of the number of votes $N_{\text{vote}}(t)$ is as follows:

$$\frac{dN_{\text{vote}}(t)}{dt} = r\left(\nu_f(t) + \nu_u(t) + \nu_i(t)\right), \tag{1}$$

where the rate $r$ is the probability that a user seeing the story will vote on it, and $\nu_f$, $\nu_u$, and $\nu_i$ are the rates at which users find the story via one of the front pages, via one of the upcoming pages, and through the friends interface, respectively. These parameters are empirical values trained from the training set. They found that when users’ response to a post (such as the number of likes) is strong enough, the post is pushed to a page that can be seen by more users, and its popularity is more likely to increase; otherwise, the post becomes less popular and may gradually dissipate in the social network.

##### 2.2. Basic Procedure of AHP

AHP is a multiobjective decision-making method that is applied in many engineering fields [13, 14]. It is a systematic analysis method combining qualitative and quantitative analysis, first proposed by Saaty [15]. Based on defined criteria and a defined procedure (see Figure 1), AHP compares the indicators at the same level of a complex problem in pairs and aims to determine the degree to which one alternative outperforms the other [16]. The basic procedure of AHP is as follows:

*Step 1.* Identify the problem and establish a hierarchical structure model. By clarifying the scope of the problem, the specific requirements, the elements involved, and the relationships between the elements, a complex multiple criteria decision-making (MCDM) problem is broken down into a hierarchy of interrelated decision elements according to the characteristics and general objectives of the problem. Generally, the hierarchy has three levels: the target layer at the top, the multiple criteria layer in the middle, and the alternatives layer at the bottom, which displays possible solutions or measures.

*Step 2.* Build a judgment matrix. The judgment matrix is the basis for relative importance calculation and hierarchical ordering. Taking an element *C* of the previous level as the evaluation criterion, the elements of each level are compared pairwise to calculate their relative importance and thereby determine the judgment matrix [16]. The elements to be compared and judged must have the same properties and be comparable. For *n* criteria, *n*(*n* − 1)/2 pairwise comparisons are required, where *n* is the number of elements. Let {*a*_{1}, *a*_{2}, …, *a*_{n}} be the set of criteria and let the (*n* × *n*) evaluation matrix *A* hold the results of the pairwise comparisons of the *n* criteria. The element *a*_{ij} of the evaluation matrix represents the relative importance of the element *a*_{i} to *a*_{j} according to the evaluation criterion *C*. The value of *a*_{ij} is determined after repeated study of the data, expert opinions, and the experience of the evaluation subject. Each element is a quotient of the weights of the criteria, as shown in

$$a_{ij} = \frac{w_i}{w_j}, \qquad a_{ji} = \frac{1}{a_{ij}}, \qquad a_{ii} = 1, \qquad i, j = 1, 2, \ldots, n. \tag{2}$$

In AHP, multiple pairwise comparisons are based on a standardized comparison scale of nine levels, as indicated in Table 1.

*Step 3.* Check the consistency.

In AHP, the indicator for judging matrix consistency is the consistency index (CI), shown in the following equation:

$$\mathrm{CI} = \frac{\lambda_{\max} - n}{n - 1}, \tag{3}$$

where *λ*_{max} is the maximum eigenvalue of judgment matrix *A* and *n* is the order of judgment matrix *A*. The greater the value of the CI, the greater the degree to which the judgment matrix deviates from full consistency. The consistency ratio (CR) can be used to infer whether the assessment is sufficiently consistent. It is calculated by dividing the CI by the random consistency index (RI), as shown in the following equation:

$$\mathrm{CR} = \frac{\mathrm{CI}}{\mathrm{RI}}, \tag{4}$$

where RI is taken from Table 2 and is based on the total number of random samples [17]. For verifying the calculated weights, Saaty [18] suggested that the value of CR should be less than 0.10 (maximum threshold). In the current work, judgment matrices with CR values greater than 0.10 are rejected and a new pairwise comparison judgment is required.

*Step 4.* Calculate the weight of each evaluation and make the decision. Finally, the decision is made based on the normalized weight values obtained above.
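The consistency check in Steps 3 and 4 can be illustrated with a minimal sketch. It assumes a power-iteration estimate of *λ*_{max} and the commonly tabulated RI values for Saaty's AHP; the paper's Table 2 may list different values, and the 3 × 3 judgment matrix below is hypothetical rather than taken from the questionnaires.

```python
# Random consistency index (RI) by matrix order, as commonly tabulated for Saaty's AHP.
# These values are an assumption here; the paper's Table 2 may list different ones.
RANDOM_INDEX = {3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}


def principal_eigen(matrix, iterations=200):
    """Approximate (lambda_max, normalized weight vector) by power iteration."""
    n = len(matrix)
    weights = [1.0 / n] * n
    lambda_max = float(n)
    for _ in range(iterations):
        product = [sum(matrix[i][j] * weights[j] for j in range(n)) for i in range(n)]
        lambda_max = sum(product)  # eigenvalue estimate, valid because weights sum to 1
        weights = [value / lambda_max for value in product]
    return lambda_max, weights


def consistency_check(matrix):
    """Return (weights, CI, CR) for a reciprocal judgment matrix of order 3 to 9."""
    n = len(matrix)
    lambda_max, weights = principal_eigen(matrix)
    ci = (lambda_max - n) / (n - 1)   # consistency index
    cr = ci / RANDOM_INDEX[n]         # consistency ratio
    return weights, ci, cr


# Hypothetical 3 x 3 judgment matrix on the 1-9 scale (not from the paper's tables).
judgment = [
    [1.0, 2.0, 5.0],
    [1 / 2, 1.0, 3.0],
    [1 / 5, 1 / 3, 1.0],
]
weights, ci, cr = consistency_check(judgment)
print(f"weights={[round(w, 3) for w in weights]}, CI={ci:.4f}, CR={cr:.4f}")
print("accept" if cr < 0.10 else "reject and repeat the pairwise comparison")
```

A matrix whose CR is at or above 0.10 would be sent back for a new round of pairwise comparison, as described above.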

Research has also shown that the timing of topics, and especially the social types of the users involved, is very important for their popularity [19]. By analysing and extracting the user features and text features of participating topics, machine learning methods can be used to forecast the influence of Weibo topics [20].

Recent advances in deep learning, especially the memory function of RNNs on previous output sequences [21], provide some useful perspectives on how to solve time series problems. Zhu et al. [22] proposed an RNN opinion dynamics model for predicting each user’s posting behaviours on Twitter, used an attention mechanism to predict user-level positions, and merged the context of the neighbors’ topics into a signal of interest. Although their approach can also be used to predict topics at the user level, the RNN does not take the topic-level time sequence correlation into consideration. According to the basic principles of deep learning methods, solving the training problem requires a reasonable input-to-output model and a suitable amount of data for learning [23].

##### 2.3. RNN Prediction Method

RNN is used for mining deep representations of time series information and semantic information from data [24]. It is commonly used in speech recognition, language modeling, machine translation, and time series analysis. The difference between an RNN and an ordinary fully connected neural network is that the nodes between the hidden layers of an RNN are connected: the input of the hidden layer includes both the output of the input layer and the output of the hidden layer at the previous moment. A recurrent neural network is essentially a neural network with an internal loop. It processes a sequence by iterating through all of its elements while retaining state information about the content seen so far. The structure of the RNN employed in this study has two inputs *x*_{1} and *x*_{2}, two hidden layers *h*_{1} and *h*_{2}, an output layer *Y*, and the weights *w*_{1} and *w*_{2} and biases *b*_{1} and *b*_{2} on them (see Figure 2).

Suppose that *h*_{i} (*i* = 1, 2) represents the *i*th hidden layer, *w*_{1} and *w*_{2} indicate the weights for input nodes *x*_{1} and *x*_{2}, and *b*_{i} represents the offset term coefficient corresponding to the *i*th output node; then the output of the prediction model is given by equation (7).
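As a rough illustration of the recurrence described above, the following sketch runs a generic Elman-style RNN forward pass. It is not the network of Figure 2: the single recurrent layer with two hidden units, the tanh activation, and all parameter names (`w_in`, `w_rec`, `b_hidden`, `w_out`, `b_out`) are assumptions made for the example.

```python
import math


def rnn_forward(sequence, w_in, w_rec, b_hidden, w_out, b_out):
    """Run a single-layer Elman RNN over `sequence` and return the final output.

    sequence: list of time steps, each a list of input features (e.g. [x1, x2]).
    w_in[k][j]: weight from input j to hidden unit k.
    w_rec[k][m]: weight from the previous hidden unit m to hidden unit k.
    """
    hidden = [0.0] * len(b_hidden)  # initial hidden state
    for x in sequence:
        # The new state depends on the current input and on the previous state.
        hidden = [
            math.tanh(
                sum(w_in[k][j] * x[j] for j in range(len(x)))
                + sum(w_rec[k][m] * hidden[m] for m in range(len(hidden)))
                + b_hidden[k]
            )
            for k in range(len(hidden))
        ]
    # Linear read-out of the last hidden state.
    return sum(w_out[k] * hidden[k] for k in range(len(hidden))) + b_out


# Hypothetical parameters: two inputs, two hidden units, one output.
output = rnn_forward(
    sequence=[[1.0, 0.0]],
    w_in=[[1.0, 0.0], [0.0, 1.0]],
    w_rec=[[0.0, 0.0], [0.0, 0.0]],
    b_hidden=[0.0, 0.0],
    w_out=[1.0, 1.0],
    b_out=0.0,
)
print(output)
```

In practice the parameters would be learned from the popularity time series rather than set by hand.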

##### 2.4. LSTM Prediction Method

LSTM is a type of RNN. Unlike traditional feedforward neural networks, an RNN is a time series-based model that can establish the time correlation between previous information and the current environment. LSTM, proposed by Hochreiter and Schmidhuber [25] in 1997, is mainly used for data classification problems and is applied in many areas such as natural language processing, image captioning, and speech recognition [26]. Since it handles problems with multiple input variables well, it can also be used for time series prediction [21].

LSTM adds a memory unit dedicated to storing historical information. The schematic diagram of the LSTM memory unit is shown in Figure 3, where $h_t$ is the hidden state vector for the current moment *t*, $\tilde{c}_t$ is the candidate state obtained with a hyperbolic tangent, and $x_t$ is the current input vector. Historical information is updated through the control of three gates: the input gate, the forget gate, and the output gate.

In this paper, $x_t$ is a popularity indicator or the output vector of the previous layer. “$\sigma$” and “$\tanh$” represent the sigmoid function and the hyperbolic tangent activation function, respectively, and “$\odot$” stands for element-wise multiplication. $i_t$, $f_t$, $o_t$, $\tilde{c}_t$, and $c_t$ are the input gate, forget gate, output gate, new memory cell, and final memory cell used in the LSTM model. The topic popularity time sequence is used as the input of the model. The model is defined by equations (9)–(15) (for simplicity, the layer index *l* is omitted).
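For orientation, the widely used textbook form of the LSTM cell is reproduced below. It is offered as a reference sketch and may differ in detail and numbering from the paper's equations (9)–(15). The weight matrices $W_\ast$ and $U_\ast$ and the bias vectors $b_\ast$ are notation assumed here.

```latex
\begin{aligned}
i_t &= \sigma\left(W_i x_t + U_i h_{t-1} + b_i\right), \\
f_t &= \sigma\left(W_f x_t + U_f h_{t-1} + b_f\right), \\
o_t &= \sigma\left(W_o x_t + U_o h_{t-1} + b_o\right), \\
\tilde{c}_t &= \tanh\left(W_c x_t + U_c h_{t-1} + b_c\right), \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t, \\
h_t &= o_t \odot \tanh\left(c_t\right).
\end{aligned}
```

The forget gate $f_t$ scales the previous memory cell $c_{t-1}$, the input gate $i_t$ scales the new candidate $\tilde{c}_t$, and the output gate $o_t$ controls how much of the updated cell is exposed as the hidden state.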

##### 2.5. GRU Prediction Method

The structure diagram of a GRU node is shown in Figure 4. Its model is defined by equations (16)–(20) (for simplicity, the layer index *l* is also omitted) [21]. $r_t$, $\tilde{h}_t$, and $z_t$ stand for the reset gate, the candidate state obtained with a hyperbolic tangent, and the update gate, respectively. The other symbols are basically the same as those used in the former two models.
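A common textbook formulation of the GRU cell is given below as a reference sketch. It is not necessarily identical to the paper's equations (16)–(20). The parameter names $W_\ast$, $U_\ast$, and $b_\ast$ are assumed notation, and some references swap the roles of $z_t$ and $1 - z_t$ in the last line.

```latex
\begin{aligned}
z_t &= \sigma\left(W_z x_t + U_z h_{t-1} + b_z\right), \\
r_t &= \sigma\left(W_r x_t + U_r h_{t-1} + b_r\right), \\
\tilde{h}_t &= \tanh\left(W_h x_t + U_h \left(r_t \odot h_{t-1}\right) + b_h\right), \\
h_t &= \left(1 - z_t\right) \odot h_{t-1} + z_t \odot \tilde{h}_t.
\end{aligned}
```

Compared with the LSTM, the GRU merges the memory cell and the hidden state and uses two gates instead of three, so it has fewer parameters per unit.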

Given that indicator sets are formed by experts, decision makers may need to analyse large amounts of data and consider many indicators [7].

#### 3. Topic Popularity Prediction Model

In this paper, the topic popularity prediction model covers two aspects: topic popularity indicator analysis and topic popularity prediction (see Figure 5). Combining the prominent characteristics of the popularity of social network topics with the indicators that many current social networks actually record [4, 6, 27], the factors that influence topic popularity can be roughly divided into the six indicators below. There is no overlap or intersection between levels, and the measurement scheme of each evaluation indicator is easy to implement and operate [6].

NR (number of retweets): users on the social network can forward a topic to their personal homepages and comment on it. NR is the number of times a topic has been retweeted from the moment it is published to the moment it is measured.

NC (number of comments): the number of opinions or reactions to the content of a topic that readers have posted from the moment it is published to the moment it is measured.

NL (number of likes): the number of times network users have clicked “Like” on a topic from the moment it is published to the moment it is measured.

NV (number of views): the number of times a topic has been read by users from the moment it is published to the moment it is measured.

NE (number of related entities): the number of entity tags or people’s names mentioned in a topic from the moment it is published to the moment it is measured.

NF (number of favourites): the number of times network users have added a topic to their favourites from the moment it is published to the moment it is measured.

##### 3.1. Definition of Topic Popularity in Social Network

In a social network, topic heat refers to the comprehensive value of the indicators of network users’ behaviour in the dissemination of a topic’s information. The topic heat at the moment *t* is calculated as

$$h_t = \sum_{i=1}^{6} w_i\, x_i(t), \tag{21}$$

where $x_i(t)$ is the value of the $i$th indicator (NR, NC, NL, NV, NE, and NF) at the moment *t* and $w_i$, $i = 1, 2, \ldots, 6$, is the weight of each indicator. Table 3 shows data of topic heat based on users’ behaviour indicators.

To make the data comparable and reliable and to eliminate differences in variation between indicators, we apply range normalization to the original data of the topic’s indicators. To keep the topic heat within [0, 100] after normalization, the value is multiplied by 100 to obtain the topic heat *H*_{t}, which is calculated from equation (22) as follows:

$$H_t = \frac{h_t - h_{\min}}{h_{\max} - h_{\min}} \times 100, \tag{22}$$

where $h_t$ is the topic heat before normalization and $h_{\max}$ and $h_{\min}$ respectively represent the maximum and minimum values of topic heat in the original data.
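The weighting and range normalization steps can be sketched as follows. The indicator records and weights are hypothetical. The `first_difference` helper reflects only one assumed reading of the heat acceleration introduced in the following paragraphs, namely the change in heat between consecutive moments.

```python
def topic_heat(indicators, weights):
    """Weighted sum of the indicator values (NR, NC, NL, NV, NE, NF) at one moment."""
    return sum(w * x for w, x in zip(weights, indicators))


def range_normalize(values, scale=100.0):
    """Map values linearly onto [0, scale] using their minimum and maximum."""
    low, high = min(values), max(values)
    if high == low:  # a constant series has no range to normalize
        return [0.0 for _ in values]
    return [(v - low) / (high - low) * scale for v in values]


def first_difference(series):
    """Change between consecutive moments (an assumed reading of 'heat acceleration')."""
    return [later - earlier for earlier, later in zip(series, series[1:])]


# Hypothetical records: one row of six indicator values per moment.
records = [
    [120, 40, 300, 5000, 3, 10],
    [180, 65, 420, 7600, 3, 16],
    [260, 90, 610, 9900, 4, 25],
]
weights = [0.30, 0.25, 0.05, 0.10, 0.12, 0.18]  # hypothetical weights, sum to 1

raw_heat = [topic_heat(row, weights) for row in records]
heat = range_normalize(raw_heat)
acceleration = first_difference(heat)
print(heat, acceleration)
```

Normalizing before differencing keeps both the heat and its change on the same [0, 100] scale.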

Let *H* be the time series vector of topic heat; then

$$H = \left(H_1, H_2, \ldots, H_T\right). \tag{23}$$

Since the popularity of social network topics presents a discrete state over time, first, according to the popularity vector obtained by equation (23), the acceleration value of the topic heat at each moment is calculated as

Considering the topic heat acceleration as a time series, let Δ*H* be the heat acceleration vector,

Suppose there are a total of *n* topics in the period [0, *T*]; then the heat tendency of topic *i* at time *t* is composed of the current heat *H*_{t} and the heat acceleration Δ*H*_{t}, as shown in Figure 6 and equation (26).

Here, *α* = 0.4 is used [28], and the degree of grey incidence (DGI) [29] is calculated according to equation (27), in which *ρ* ∈ [0.1, 0.5). Thus, at the moment *t*, the popularity of the *i*th topic on the social network is given by equation (28), where *w*_{i} stands for the weight of each indicator.
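For readers unfamiliar with grey incidence analysis, the sketch below implements the classical (Deng) grey relational degree with distinguishing coefficient *ρ*. This is the standard textbook form, offered as an assumption about what DGI refers to here; how the paper combines it with the heat tendency is not reproduced. The sequences and the choice *ρ* = 0.4 are hypothetical, and inputs are assumed to be normalized to a common scale beforehand.

```python
def grey_incidence_degrees(reference, comparisons, rho=0.4):
    """Return the grey relational degree of each comparison sequence to `reference`.

    rho is the distinguishing coefficient; a value in [0.1, 0.5) is assumed here.
    """
    # Absolute differences between the reference and every comparison sequence.
    deltas = [[abs(r - c) for r, c in zip(reference, seq)] for seq in comparisons]
    flat = [d for row in deltas for d in row]
    d_min, d_max = min(flat), max(flat)
    if d_max == 0.0:  # every sequence coincides with the reference
        return [1.0] * len(comparisons)
    degrees = []
    for row in deltas:
        # Grey relational coefficient at each moment, then averaged over time.
        coefficients = [(d_min + rho * d_max) / (d + rho * d_max) for d in row]
        degrees.append(sum(coefficients) / len(coefficients))
    return degrees


# Hypothetical normalized sequences.
reference = [1.0, 2.0, 3.0]
comparisons = [[1.0, 2.0, 3.0], [2.0, 3.0, 5.0]]
degrees = grey_incidence_degrees(reference, comparisons)
print(degrees)
```

A degree close to 1 means the comparison sequence follows the reference closely, and smaller values indicate weaker incidence.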

##### 3.2. AHP for Topic Popularity Indicator Selection

(1) A hierarchical structure model is established based on the definition of topic popularity. As shown in Figure 6, the indicator selection model for topic popularity prediction is divided into three layers. The “*Goal*” layer is the evaluation target layer, that is, indicator selection for topic popularity prediction. The “*Indicators*” layer contains the indicators selected from the literature: “NR,” “NC,” “NL,” “NV,” “NE,” and “NF.” In the third layer, the prediction results of the deep learning prediction models on the topic indicators are employed as the screening principle for the topic popularity indicators: the deep learning models train and predict the popularity on each indicator, and the indicators with the highest accuracies among the prediction outputs are selected.

(2) The data of the judgment matrix come from questionnaires completed by 10 experts and professors who have been engaged for several years in big data mining with deep learning algorithms in social networks. When they compare the indicators, criteria, and strategies, they use their perception and experiments, along with the objective facts at their disposal. For example, 2.08 in Table 4 indicates that the average score of the experts for the importance of *i*1 relative to *i*2 is 2.08. Judgment matrices of the strategic layer relative to each criterion are shown in Tables 5–10.

(3) Calculate the parameters according to formulas (3) and (4) (see the textbook [30] for the specific process). Table 11 shows the results obtained from AHP on topic attributes, where the ranking is “NR” > “NC” > “NF” > “NE” > “NV” > “NL.” Tables 12–17 present the results obtained from AHP on the prediction models with indicators NR, NC, NL, NV, NE, and NF, respectively. The ranking is “GRU” > “LSTM” > “RNN” for every indicator except NL (Table 14), where it is “LSTM” > “GRU” > “RNN.” All these CRs are below 0.10. Based on these, we calculate the CI for each of the tables above and proceed to measure the total consistency for the popularity prediction problem. For Table 11, the CI equals 0.0713; for the other six tables, the values are 0.0024 (Table 12), 0.0064 (Table 13), 0.0399 (Table 14), 0.0035 (Table 15), 0.0037 (Table 16), and 0.0026 (Table 17). The overall consistency index is 0.0399.

(4) The deep learning prediction algorithms are applied for indicator selection, choosing the indicators that have a greater impact on topic popularity. The overall priority of alternative *S*_{i} equals the sum, over the indicators, of the priority of each indicator multiplied by the individual priority of alternative *S*_{i} under that indicator:

$$P(S_i) = \sum_{j=1}^{6} w_j\, p_{ij}, \tag{29}$$

where $w_j$ is the priority of the $j$th indicator (Table 11) and $p_{ij}$ is the priority of $S_i$ with respect to that indicator (Tables 12–17). For example, the overall priority for *S*_{1} in Table 18 is obtained in this way.

Similarly, all other priorities are calculated in this way. Therefore, we have priority for RNN = 21.5%, priority for LSTM = 36.0%, and priority for GRU = 42.5%.
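The aggregation of overall priorities is a weighted sum and can be sketched as below. The indicator weights and local priorities are hypothetical placeholders rather than the values of Tables 11–18, and the function name `overall_priorities` is introduced only for this example.

```python
def overall_priorities(indicator_weights, local_priorities):
    """Combine local priorities into overall priorities.

    indicator_weights[j]: priority of indicator j (should sum to 1).
    local_priorities[j][k]: priority of alternative k under indicator j.
    """
    n_alternatives = len(local_priorities[0])
    return [
        sum(w * row[k] for w, row in zip(indicator_weights, local_priorities))
        for k in range(n_alternatives)
    ]


# Hypothetical example with two indicators and three alternatives (RNN, LSTM, GRU).
indicator_weights = [0.6, 0.4]
local_priorities = [
    [0.2, 0.3, 0.5],  # priorities under the first indicator
    [0.3, 0.3, 0.4],  # priorities under the second indicator
]
priorities = overall_priorities(indicator_weights, local_priorities)
print([round(p, 2) for p in priorities])
```

Because each row of local priorities and the indicator weights both sum to 1, the overall priorities also sum to 1, as the reported values 21.5%, 36.0%, and 42.5% do.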

The calculation process is as follows. Firstly, the three deep learning algorithms are each used to predict the topic popularity time series. Secondly, the prediction results are weighted by the priorities above to obtain the final prediction value. Thirdly, the indicators whose prediction accuracy ranks in the top 85% of the indicator set are selected as the final indicators for the popularity definition [13].
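A minimal sketch of the second and third steps follows. It uses the overall priorities reported above as model weights. The per-indicator accuracies are hypothetical, and interpreting "top 85%" as keeping the best-ranked 85% of the indicators (rounded down) is an assumption.

```python
import math

# Overall priorities reported in the text (RNN 21.5%, LSTM 36.0%, GRU 42.5%).
MODEL_PRIORITIES = {"RNN": 0.215, "LSTM": 0.360, "GRU": 0.425}


def weighted_prediction(predictions, priorities=MODEL_PRIORITIES):
    """Combine the per-model predictions for one moment into a single value."""
    return sum(priorities[name] * value for name, value in predictions.items())


def select_top_indicators(accuracy_by_indicator, keep_fraction=0.85):
    """Keep the best-ranked fraction of indicators by prediction accuracy."""
    ranked = sorted(accuracy_by_indicator, key=accuracy_by_indicator.get, reverse=True)
    keep = max(1, math.floor(len(ranked) * keep_fraction))
    return ranked[:keep]


# Hypothetical model outputs for one moment and hypothetical accuracies per indicator.
combined = weighted_prediction({"RNN": 61.0, "LSTM": 64.0, "GRU": 65.0})
accuracies = {"NR": 0.98, "NC": 0.97, "NL": 0.90, "NV": 0.95, "NE": 0.96, "NF": 0.94}
selected = select_top_indicators(accuracies)
print(round(combined, 2), selected)
```

With six indicators, this reading keeps five of them and drops the one with the lowest accuracy.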

#### 4. Experiment Results and Discussion

Using these deep learning models, we can predict the topic popularity on an online social network at the next moment. In addition, the accuracies of the three deep learning models for predicting topic popularity are compared, including predictions on the original attributes and on the novel definition of popularity with DGI.

##### 4.1. Data Collection

Topic popularity data sources used in this study include the numbers of “retweets,” “comments,” “likes,” “views,” “entities,” and “favorites” mentioned above as in Table 4. These are used to study the accuracy, resolution, and characteristics in topic popularity prediction. In this research, we design an evaluation questionnaire to compare the importance of various indicators of topic heat on topic popularity and the comparison of the prediction accuracy of RNN, LSTM, and GRU in each indicator.

The dataset for this experiment was collected from topics on the Weibo Hot Topics List between 6 pm on 21 April 2020 and 6 pm on 24 April 2020. A crawler collected the numbers of “retweets,” “comments,” “likes,” “views,” “entities,” and “favorites” every 5 minutes. For this synthetic dataset, we use a generation process similar to that described in [7]. We use 12 different topics as the training dataset and collect a total of 62208 sequences. Each data instance in the dataset is 20 moments long (10 moments for the input and 10 moments for the prediction).
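One reading consistent with the reported total is that 72 hours sampled every 5 minutes gives 864 points per indicator per topic, and 864 × 6 indicators × 12 topics = 62208. The sketch below checks this arithmetic and shows how 20-moment instances could be cut from one series. The stride-1 sliding window is an assumption, since the text does not state how instances were extracted.

```python
SAMPLES_PER_SERIES = 72 * 60 // 5            # 72 hours at 5-minute intervals
TOTAL_POINTS = SAMPLES_PER_SERIES * 6 * 12   # 6 indicators x 12 topics


def make_windows(series, input_len=10, target_len=10):
    """Split a series into (input, target) pairs using a stride-1 sliding window."""
    window = input_len + target_len
    return [
        (series[start:start + input_len], series[start + input_len:start + window])
        for start in range(len(series) - window + 1)
    ]


# Hypothetical series standing in for one indicator of one topic.
series = list(range(SAMPLES_PER_SERIES))
instances = make_windows(series)
print(SAMPLES_PER_SERIES, TOTAL_POINTS, len(instances))
```

Each instance pairs 10 observed moments with the 10 moments that follow them, matching the instance length given above.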

##### 4.2. Experimental Case

This study mainly includes four experimental cases. The first three experiments are all popular algorithms used for time series in deep learning, and the last one is a case of AHP-based indicator selection method using deep learning method proposed in this paper, as shown in Table 19.

##### 4.3. Evaluation of Topic Popularity Prediction

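To demonstrate the accuracy of the proposed prediction model, the actual value *h*(*x*_{i}) and the predicted value $\hat{h}(x_i)$ are compared. Equation (30) is the mean absolute error (MAE) used to examine the accuracy of the deep learning models in this paper:

$$\mathrm{MAE} = \frac{1}{m} \sum_{i=1}^{m} \left| h(x_i) - \hat{h}(x_i) \right|, \tag{30}$$

where $m$ is the number of predicted samples.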
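The MAE can be computed directly from paired actual and predicted values, as in the following sketch. The sample values are hypothetical and the function name is introduced only for this example.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical popularity values on the [0, 100] scale.
actual = [52.0, 55.0, 61.0, 60.0]
predicted = [50.0, 56.0, 59.0, 63.0]
print(mean_absolute_error(actual, predicted))
```

Because the popularity values are normalized to [0, 100], the MAE is expressed in the same units and can be compared across indicators.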

The performance of the three deep learning models is shown in Table 20, where the “Popularity” means the definition proposed in this paper and the bold text indicates the smallest prediction error among the error values in each row and column. Here, the accuracy is calculated as follows:

Prediction experiments on topic popularity are conducted with three different deep learning models and the AHP-based model proposed in this paper. The accuracy of the experimental results is compared in Table 20, where case nos. 1 to 3 are the basic RNN, LSTM, and GRU, which use the “linear” activation and 5000 iterations in each experiment on a topic, and case no. 4 is the indicator selection model proposed in this paper. According to the experimental results, the improved AHP indicator selection model proposed in this research has the highest prediction accuracy, with an average accuracy of 97.66%, while the average accuracies of experiments 1 to 3 are 91.30%, 94.49%, and 94.44%, respectively. In particular, the experiment on the proposed popularity definition raises the prediction accuracy to 98.96%, as shown in Table 20.

#### 5. Conclusions and Future Work

In this paper, we have applied deep learning approaches to the challenging problem of AHP-based topic popularity prediction, which so far has not benefited from sophisticated deep learning techniques. We conducted qualitative research on the indicators that influence social network topic popularity in the context of big data and built an indicator selection model that improves the comprehensive evaluation of AHP with deep learning approaches, specifically RNN, LSTM, and GRU. By defining the maximum correlation vector on the indicators, topic popularity is defined quantitatively with the DGI method, and the weights of the topic indicators are obtained with AHP. Some new indicators are used in this research, including the number of views, the number of entities mentioned in or associated with a topic, and the number of favorites. The average accuracy of the experiments reaches 97.66%. The comparison of experiments shows that, when the prediction accuracy of deep learning algorithms is used as the indicator screening principle for AHP, the topic popularity prediction model trains faster and predicts more accurately. The indicators mentioned in the previous literature can predict the development tendency of topic popularity well. When topic popularity is defined by DGI and the prediction value is obtained with the prediction algorithm weights provided by AHP, the prediction accuracy is better. This method is superior to prediction models established without screening, and its predicted values are closer to the actual measurements, which provides a reference for the supervision and control of topic popularity tendencies.

For future work, we will explore different kinds of social networking platforms, exploit community analysis algorithms, and study parameter adjustments for topic popularity prediction.

#### Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

#### Conflicts of Interest

The authors declare that they have no conflicts of interest.

#### Acknowledgments

This work was supported in part by the Educational Commission Project of Fujian Province, China, under Grant JAT170327 and in part by the Natural Science Foundation of Fujian Province, China, under Grant 2018J01791 and Grant 2018J01539. The authors sincerely thank the experts who participated in the questionnaire for their work and concern.