Abstract

Following the outbreak of a disease, panic often spreads on online forums, which seriously affects normal economic operations as well as epidemic prevention procedures. Online panic is often manifested earlier than in the real world, leading to an aggravated social response from citizens. This paper conducts sentiment analysis on more than 80,000 comments about COVID-19 obtained from the Chinese Internet and identifies patterns within them. Based on this analysis, we propose an agent-based model consisting of two parts—a revised SEIR model to simulate an offline epidemic and a scale-free network to simulate the Internet community. This model is then used to analyze the effects of the social distancing policy. Assuming the existence of such a policy, online panic is simulated corresponding to different informatization levels. The results indicate that increased social informatization levels lead to substantial online panic during disease outbreaks. To reduce the economic impact of epidemics, we discuss different strategies for releasing information on the epidemic. Our conclusions indicate that announcing the number of daily new cases or the number of asymptomatic people following the peak of symptomatic infections could help to reduce the intensity of online panic and delay the peak of panic. In turn, this can be expected to keep social production more orderly and reduce the impact of social responses on the economy.

1. Introduction

Complexity science is a field of nonlinear science research, e.g., the theory of chaotic dynamics, which contends that simple behavior on an individual level may lead to uncertain complex behavior in aggregate. With advancements in computing power over time, the functions of cellular automata have been empowered, pushing such research into a broader range of fields. In the field of management and social research, Schelling (1971) concluded that simple rules can simulate complex changes and explained the segregation phenomenon using certain move strategies and two types of agents [1]. In such models, the cell is regarded as an agent, and the model in its entirety is called an agent-based model (ABM). The visibility of an ABM is enhanced by the standardized description of the mapping relationship between each agent's real behavior and abstract mathematical rules [2]. In recent years, a significant amount of unstructured textual data of considerable informational value has been generated. Rapid development in natural language processing technology and computing power has enabled us to obtain useful information from such data. Khatua, Khatua, and Cambria (2019) used PubMed abstracts in a pretraining model and constructed high-quality domain-specific word vectors, which improved the accuracy of epidemic monitoring based on social media texts [3]. Sentiment analysis, also known as opinion mining, is another important natural language processing technique. It is a data-based method used to analyze emotional tendencies. For example, sentiment analysis can be used to analyze the impact of terrorist attacks on people's psychology [4]; by governments to predict, monitor, and analyze public opinion on political issues [5]; and by enterprises to estimate user satisfaction with certain products or services [6]. Textual sentiment analysis also benefits researchers, helping them to design more reasonable models and to investigate the complexity of phenomena more realistically.

The objective of the current study is to investigate patterns in the transmission of online panic and to minimize its adverse effects. Panic refers to a special form of collective behavior that occurs when a group subjectively believes that resources are scarce, and it is usually accompanied by maladaptive behavior [7]. In particular, people experiencing panic during an epidemic are liable to act irrationally [8]. In the academic literature, this phenomenon is called a social response. Considerable research has been conducted on epidemiological prediction based on social media data [9] and on social responses caused by epidemics [10]. Social responses to epidemic outbreaks range from mental stress and economic downturn to flight from the outbreak site and distrust of official announcements [11]. Fraud, theft, robbery, and other disruptions of social order are known to become common, and epidemic prevention orders are often ignored [12, 13]. In the financial market, investor sentiment significantly influences the stock market [14]. The social response to an epidemic affects the stock market, the retail sector, and individual incomes and eventually induces an economic recession. Following the COVID-19 outbreak, the imposition of social distancing policies and international travel restrictions severely affected global economic activity [15]. The International Monetary Fund estimated that the average economic growth rate in advanced economies was −4.9% in 2020 due to the impact of COVID-19. The epidemic also affected developing economies, where the average growth rate was −2.4% in 2020 [16].

Online panic is characterized by its rapid transmission and is not geographically constrained. These characteristics escalate its impact, causing severe economic losses. On April 23, 2013, fake news reports claiming that Barack Obama was injured in twin explosions in the White House were spread on Twitter, inducing a loss of $136.5 billion in the stock market [17]. Online panic can also lead to health problems: people in Iran were misled to believe that drinking alcohol can prevent and treat the novel coronavirus, leading to the deaths of hundreds of people in several provinces of Iran [18]. Nicomedes and Avila found levels of anxiety regarding health to be consistent irrespective of the location of individuals or their exposure to COVID-19 patients [19]. Rumor is an important source and vector of online panic, and several researchers have attempted to explain its mechanism [20, 21]. Following the COVID-19 outbreak, the relationship between information and online panic, leading to what is known as an “infodemic,” has been studied [22]: the flow of information leads to anxiety and caution, while misinfodemics cause panic, distrust, and confusion [23]. Ahmad and Murad investigated the relationship between social media and the transmission of panic regarding COVID-19 on the basis of questionnaires and identified fake news about COVID-19 and the dissemination of the number of infections to be the two primary contributors [24]. Panic buying (PB) is another important topic of research: online panic often leads to impulsive and obsessive buying, whose negative aspects have been extensively portrayed in the media [25, 26].

Existing studies reveal that sudden outbreaks of diseases often lead to panic, resulting in severe consequences. Recently, online panic following the COVID-19 outbreak has been investigated. Online networks have a special structural characteristic: they are scale-free networks. This implies that the transmission of panic on the Internet is different from that in real-world environments. Epidemic information plays an essential role in the spread of online panic, so transparency on this topic is a double-edged sword. It can increase public distrust of the government, and it can also trigger panic on the Internet and affect the normal functioning of society and the economy. In this paper, we consider how the transmission of panic in the information age differs from that in earlier times and explore reasonable steps of information disclosure to control panic effectively. Existing studies have used empirical or modeling-based methods to study the damage caused by the transmission of diseases. Fast et al. used an agent-based model to analyze the differences in social responses caused by several epidemics [27]. However, their model assumed the degree distribution of the relationship network to be uniform and did not consider the role of the Internet. Other studies have used questionnaires and regression methods to analyze the causes or impact of online panic without exploring its mechanism. The transmission of diseases and that of online panic should be investigated in an integrated fashion to guide information release policies. The model proposed in this paper is based on this outlook.

2. Materials and Methods

In this study, a dual network model, accounting for offline disease transmission and online emotional transmission, is proposed. This model is applied to investigate the relationships between public disclosure of health information, dissemination of information to mass media, and public perception of the risk of disease. Further, the impact of variations in the parameters on the evolution of online panic is measured.

2.1. Social Media Sentiment Analysis

SnowNLP is a simplified Chinese text processing toolkit that can be used to assign emotional intensity scores, ranging between 0 and 1, to Chinese texts. It employs word segmentation and trains a naive Bayesian model to perform emotional analysis on new texts:

$$P(c_1 \mid w_1, \ldots, w_n) = \frac{P(w_1, \ldots, w_n \mid c_1)\,P(c_1)}{P(w_1, \ldots, w_n \mid c_1)\,P(c_1) + P(w_1, \ldots, w_n \mid c_2)\,P(c_2)} \tag{1}$$

In (1), $c_1$ and $c_2$ denote two separate categories, and $w_1, \ldots, w_n$ denote the features. The model was trained using a training dataset and topical data about COVID-19 collected from blog posts and comments published on Weibo (a Chinese social media platform, similar to Twitter) between January 20, 2020, and April 1, 2020. In total, 80,235 texts were collected. Inspired by Xiong et al. (2020), negative comments were defined to be those with an emotional score below 0.1, while positive comments were defined to be those with an emotional score above 0.9 [28].
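The thresholding step described above can be sketched as follows. This is a minimal illustration rather than the authors' pipeline: it assumes that a score in [0, 1] has already been produced for each comment (for instance by SnowNLP's `sentiments` attribute), and the function names `label_comment` and `daily_summary` are hypothetical.

```python
from typing import Iterable

NEGATIVE_THRESHOLD = 0.1  # scores below this count as negative (from the text)
POSITIVE_THRESHOLD = 0.9  # scores above this count as positive (from the text)


def label_comment(score: float) -> str:
    """Map a sentiment score in [0, 1] to 'negative', 'positive', or 'neutral'."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must lie in [0, 1], got {score}")
    if score < NEGATIVE_THRESHOLD:
        return "negative"
    if score > POSITIVE_THRESHOLD:
        return "positive"
    return "neutral"


def daily_summary(scores: Iterable[float]) -> dict:
    """Return the mean score and the shares of negative and positive comments."""
    scores = list(scores)
    if not scores:
        raise ValueError("no scores supplied")
    labels = [label_comment(s) for s in scores]
    n = len(scores)
    return {
        "mean_score": sum(scores) / n,
        "negative_share": labels.count("negative") / n,
        "positive_share": labels.count("positive") / n,
    }
```

Applying `daily_summary` to the scores of one day's comments gives the two quantities tabulated per day in this section, namely the average emotion score and the proportion of negative comments.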

Tables 1 and 2 record the average emotion scores and the proportion of negative comments. The average emotion scores were observed to lie within the range between 0.368 and 0.684, and negative comments accounted for a maximum of 29.15% of the total number of comments. As the number of confirmed cases reported increased, negative comments were observed to increase in proportion, rising by a factor of 1.97 between January 21 and February 4. This indicates that an increase in the number of reported cases per day sours the mood of online discussions, while a decrease in the number of reported daily cases improves the mood of online discussions. Before the 36 new cases on March 31, the daily number of newly reported cases was less than 10. The sudden spike in the number of reported cases on that day led to a rapid increase in the proportion of negative comments on the Internet.

Figure 1 depicts the emotional distribution of the topics recorded in Tables 1 and 2. The emotional distribution was bipolar, and neutral views held little sway. This can be attributed to the young demographic of the Weibo user base and the degree of anonymity that users enjoy on the platform. In addition, whenever the number of reported cases was higher than that of the previous day, the proportion of positive comments (emotion score > 0.9) was observed to decrease significantly, while that of negative comments (emotion score < 0.1) increased significantly. This held even when there was no significant difference in the number of new cases over a longer period, the total number of cases, the number of asymptomatic infected persons, or other reported figures. This suggests that netizens primarily focus on the comparison between the number of cases reported each day and that of the previous day and do not consider which statistical measure is being reported. This analysis helped us to establish the following rules for the proposed model.

2.2. Attributes Based on Epidemiological Dynamics

Several studies have been conducted on epidemics. Kermack and McKendrick (1927) proposed the SIR model, which divides the population into three categories—susceptible, diseased, and recovered [29]. It is important to note that the SIR model assumes that once infected, patients cannot be infected again—thus, infected patients will either recover or die from the disease. Subsequently, Anderson and May (1992) proposed the SEIR model based on the SIR model [30]. In this model, people are categorized into four classes—susceptible, exposed, infectious, and recovered. Consideration of the scenarios in which patients infected with epidemic diseases do not develop immunity after recovery led to the proposal of the SIS model. Further, diseases may exhibit rapid mutation, leading to short-term immunity of infected and recovered patients but renewed susceptibility over longer terms. The SIRS model was proposed to account for this scenario.

COVID-19 patients exhibited weak infectious ability during their incubation periods [31]. Thus, based on the SEIR model and existing studies, we propose a modified SEIR model with the following parameters. $X_i(t)$ denotes the health status of individual i at time t, with $X_i(t) \in \{S, E, I, R\}$, where S, E, I, and R denote the classes of susceptible, exposed, infected, and recovered individuals, respectively. People move between the four states over time. Most epidemics are transmitted primarily through close contact between people. Let us suppose that each person comes into close contact with $c$ people at each time step. Let the probability of infection by contact with an exposed person, which includes patients in the incubation period, be $\beta_E$, and let the probability of infection by contact with an infected person be $\beta_I$. Further, let the probability of recovery for an exposed or infected person be $\gamma$. It is to be noted that this includes patients who die of the disease. All recovered persons are considered to be no longer infected. Finally, let the probability of an exposed person becoming infected be $\sigma$. This accounts for the case in which an asymptomatic person turns into a patient with definite symptoms. Given these parameters, and writing $S(t)$, $E(t)$, $I(t)$, and $R(t)$ for the number of individuals in each class and $N$ for the total population, the expected epidemiological dynamics are as follows:

Susceptible: $$S(t+1) = S(t) - \frac{c\,S(t)\,[\beta_E E(t) + \beta_I I(t)]}{N}$$

Exposed: $$E(t+1) = E(t) + \frac{c\,S(t)\,[\beta_E E(t) + \beta_I I(t)]}{N} - (\sigma + \gamma)\,E(t)$$

Infected: $$I(t+1) = I(t) + \sigma\,E(t) - \gamma\,I(t)$$

Recovered: $$R(t+1) = R(t) + \gamma\,[E(t) + I(t)]$$
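One time step of such a modified SEIR process can be sketched at the agent level as follows. This is an illustrative sketch under stated assumptions, not the NetLogo implementation used in the study: contacts are drawn uniformly at random from the whole population, updates are synchronous, and all parameter names (`contacts_per_step`, `p_inf_exposed`, `p_inf_infected`, `p_recover`, `p_onset`) are hypothetical labels for the contact number and the four probabilities described above.

```python
import random
from collections import Counter

S, E, I, R = "S", "E", "I", "R"


def seir_step(states, contacts_per_step, p_inf_exposed, p_inf_infected,
              p_recover, p_onset, rng):
    """Advance every agent by one time step and return the new list of states.

    Requires p_recover + p_onset <= 1 so that the two exits from E are exclusive.
    """
    n = len(states)
    new_states = list(states)
    for i, state in enumerate(states):
        if state == S:
            # Contacts are evaluated against the previous step's states.
            for _ in range(contacts_per_step):
                other = states[rng.randrange(n)]
                if other == E:
                    p = p_inf_exposed
                elif other == I:
                    p = p_inf_infected
                else:
                    p = 0.0
                if rng.random() < p:
                    new_states[i] = E
                    break
        elif state == E:
            # An exposed agent either recovers, develops symptoms, or stays exposed.
            u = rng.random()
            if u < p_recover:
                new_states[i] = R
            elif u < p_recover + p_onset:
                new_states[i] = I
        elif state == I:
            if rng.random() < p_recover:
                new_states[i] = R
    return new_states


# Illustrative run with placeholder parameter values (not taken from the paper).
rng = random.Random(0)
population = [I] * 5 + [S] * 995
for _ in range(60):
    population = seir_step(population, 5, 0.02, 0.05, 0.05, 0.2, rng)
print(Counter(population))
```

Setting `p_inf_exposed` lower than `p_inf_infected` reproduces the weak infectiousness during incubation that motivates the modification. A social distancing policy could be represented by lowering `contacts_per_step` from a chosen day onward.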

The correlation between health status and online social relationships is depicted in Figure 2.

2.3. Netizen Attributes

Netizens are assigned health attributes, $X_i(t)$, and online panic attributes, $Y_i(t)$. $Y_i(t)$ denotes the intensity of the online panic of agent i at time t. $Y_i$ is a continuous variable, $Y_i \in [0, 1]$, where 0 represents the absence of all panic and 1 represents the most severe level of online panic. In the initial state, $Y_i \in [0, 0.1]$.

Firstly, when the public is in a rational state of mind, they are willing to listen to public health sector guidance and trust officially released information. The public's judgment of dangers, its fears, and its sense of a proportionate response are determined to a large extent by its trust in the public health sector. Secondly, on the Internet, panic is not geographically constrained, and online communities are prone to cross-regional transmission. Thus, the following three rules are adopted when modeling the transmission of online panic:
(1) When communicating with neighbors, individuals are more susceptible to the emotions of the most fearful neighbors.
(2) Whenever the media reports disease-related information, popular panic increases with a certain probability, increasing the values of $Y_i$.
(3) When a node is infected, panic increases rapidly to $Y_i = k$, where k is a constant that depends on the severity of the disease, e.g., the number of deaths from the disease and the existence of sequelae of the disease.

Psychological studies have established that psychological trauma as well as information anxiety decrease over time, with a fixed coefficient of α = 0.95. At the end of each round of network evolution, each individual's panic is attenuated, and the reduced panic emotion is taken as the initial value of the next round. The following formula is used for this purpose:

$$Y_i(t+1) = \alpha\,Y_i(t)$$
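The effect of this attenuation can be illustrated with a short calculation. The sketch below assumes that the attenuation is multiplicative, i.e., that each round scales an agent's panic by α, and the helper names `decay` and `rounds_to_fraction` are hypothetical.

```python
import math

ALPHA = 0.95  # attenuation coefficient stated in the text


def decay(panic: float, rounds: int = 1, alpha: float = ALPHA) -> float:
    """Panic remaining after `rounds` rounds of multiplicative attenuation."""
    return panic * alpha ** rounds


def rounds_to_fraction(fraction: float, alpha: float = ALPHA) -> float:
    """Number of rounds until panic falls to `fraction` of its starting value."""
    return math.log(fraction) / math.log(alpha)
```

Under this assumption, an agent that receives no further stimulus loses half of its panic in roughly 13.5 rounds, so a single media report or infection event keeps influencing the network for about two weeks of simulated time.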

Internet users tend to be convinced by users with radical views and empathize with users with strong emotions, as confirmed by research on the transmission of word-of-mouth on the Internet. Mudambi and Schuff (2010) found that users considered extreme emotional comments to be more useful than more moderate comments [32]. Inspired by the DeGroot model and cognitive psychology [33, 34], we contend that individual panic can influence neighbors in a biased manner. In particular, the online panic of each node in the network is a weighted average that is influenced by its neighboring nodes. Agents exhibiting higher panic are assigned higher weights: the influence weight, $w_i(t)$, of the ith node on all neighboring agents at time t increases with $Y_i(t)$, the intensity of the online panic of agent i at time t.

In addition, the rules of influence of online panic among individuals are defined over the connections of the network, where $a_{ij}$ denotes the existence or absence of a connection between node i and node j; $a_{ij} = 1$ implies the existence of a connection between the two nodes.
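A biased, DeGroot-style averaging step of this kind can be sketched as follows. The exact weighting formula is not reproduced here, so the sketch makes an explicit assumption: each node averages over itself and its neighbors with weights proportional to their current panic. The function name `influence_step` and the dictionary-based network representation are likewise hypothetical.

```python
def influence_step(panic, neighbors):
    """Return updated panic levels after one round of biased averaging.

    panic:     dict mapping node -> panic level in [0, 1]
    neighbors: dict mapping node -> list of adjacent nodes (undirected network)
    """
    updated = {}
    for node, own in panic.items():
        group = [node] + list(neighbors.get(node, []))
        # Assumed weighting: more fearful agents carry proportionally more weight.
        weights = [panic[j] for j in group]
        total = sum(weights)
        if total == 0:
            updated[node] = own  # nobody in the group is panicking
        else:
            updated[node] = sum(w * panic[j] for w, j in zip(weights, group)) / total
    return updated


# A calm node (panic 0.1) attached to a fearful hub (panic 0.9).
levels = {0: 0.9, 1: 0.1, 2: 0.1}
links = {0: [1, 2], 1: [0], 2: [0]}
print(influence_step(levels, links))
```

With these weights, the calm node moves to about 0.82 rather than to the unweighted mean of 0.5, which reflects the rule that individuals are more susceptible to the emotions of their most fearful neighbors. Because the result is a weighted average, it always stays within [0, 1].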

2.4. Network Attributes

A scale-free network is a type of complex network. The formation of such networks rests on two assumptions: the growth hypothesis and the preferential attachment hypothesis. That is, new nodes continually appear in the network, and new nodes are more inclined to connect with existing nodes with higher degrees. The degree distribution of such a network follows a power law, $P(d) \propto d^{-k}$, where d denotes the node degree and the exponent k lies between 2 and 3 [35]. Li et al. (2015) empirically studied 10 million user profiles on the largest Chinese microblog, Sina Weibo, and 41.7 million profiles on Twitter and identified the invariant characteristic that the follower counts of users obey a power-law distribution with an exponent almost equal to 2 [36]. Moreover, rumors are known to spread faster in scale-free networks, which are also called BA networks, than in small-world networks [37].

In this study, the NetLogo software is used to construct the model and perform the simulation. Individuals are taken to be the agents. Connections between agents are determined by the online connection between the two corresponding people, which is mutual and reflected in the network as an undirected connection between the two points. As depicted in Figure 3, according to these rules, a scale-free network containing 30,000 nodes is constructed in this study, and its degree distribution is observed to follow a power-law distribution with an exponent of 2.
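The growth and preferential attachment rules can be sketched in a few lines. The example below is a plain Barabási-Albert-style generator written for illustration, not the NetLogo code used in the study. The function name `barabasi_albert` and the parameter `m` (edges added per new node) are assumptions. Plain preferential attachment tends toward a degree exponent near 3, so matching the exponent of 2 reported above would require a variant of this scheme.

```python
import random
from collections import Counter


def barabasi_albert(n, m, rng):
    """Return an undirected edge list for n nodes, each new node adding m edges."""
    if m < 1 or n <= m:
        raise ValueError("need 1 <= m < n")
    edges = []
    # `pool` holds one entry per edge endpoint, so a uniform draw from it
    # selects a node with probability proportional to its degree.
    pool = []
    targets = list(range(m))  # the first new node links to the m seed nodes
    for new in range(m, n):
        for t in targets:
            edges.append((new, t))
        pool.extend(targets)
        pool.extend([new] * m)
        chosen = set()
        while len(chosen) < m:  # m distinct targets for the next node
            chosen.add(rng.choice(pool))
        targets = list(chosen)
    return edges


edges = barabasi_albert(2000, 2, random.Random(0))
degree = Counter()
for u, v in edges:
    degree[u] += 1
    degree[v] += 1
print(max(degree.values()), sum(degree.values()) / len(degree))
```

The printed maximum degree is far above the mean degree, which is the signature of the few highly connected hubs that make panic spread quickly through such a network.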

2.5. Attributes Related to the Disclosure of Public Health Information and the Media

Information conveyed by the media to the public has an essential impact on public sentiment. For example, public fear of dying in a plane crash is much greater than public fear of developing heart disease, even though, in reality, flying poses a far lower risk than heart disease. Young et al. (2013) concluded that diseases that are reported more frequently in the media attract significantly more public attention irrespective of their severity [38]. Disclosure of information by the government also affects the judgment of the media. The government can choose to publish the number of new cases recorded on the previous day and the total number of cases as of the previous day, and it can choose whether or not to declare the number of asymptomatic infected persons. When published data exceed half of the all-time high number of infections, disease-related news frequently appears on social media [8]. In our model, we set the initial intensity of social media (M) to be 0 and update it according to a rule defined in terms of the following quantities: k denotes a constant that depends on the severity of the disease, $N(t)$ denotes the number of cases declared by the government at time t, $t_0$ denotes an instant before time t, and $M(t)$ denotes the intensity of M at time t.

An online individual, i, perceives the severity of the epidemic reported by social media with probability $p_i$, which is determined by the penetration rate of public media.
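The interaction between declared case counts, media intensity, and media penetration can be sketched as follows. The update rule itself is not reproduced here, so the sketch assumes a simple threshold form suggested by the description: media intensity equals k on days when the declared figure exceeds half of the highest figure declared on any earlier day and is 0 otherwise. The function names `media_intensity` and `perceiving_agents` are hypothetical.

```python
import random


def media_intensity(reported, k):
    """Return M(t) for each day, given the declared case counts N(t)."""
    intensity = []
    peak = 0  # highest figure declared on any earlier day
    for n_t in reported:
        # Assumed rule: coverage switches on when N(t) exceeds half the earlier peak.
        intensity.append(k if n_t > 0.5 * peak else 0.0)
        peak = max(peak, n_t)
    return intensity


def perceiving_agents(n_agents, penetration, rng):
    """Return the indices of agents who notice the coverage on a given day."""
    return [i for i in range(n_agents) if rng.random() < penetration]


print(media_intensity([0, 10, 4, 6, 20, 9], k=0.3))
print(len(perceiving_agents(1000, 0.6, random.Random(0))))
```

Under this assumed rule, the series fed into `media_intensity` is the policy lever examined later in the paper: passing daily new cases instead of the total number of existing cases changes on which days coverage switches on, while `penetration` plays the role of the level of social informatization.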

3. Results and Discussion

In this study, He's (2018) method is adopted to summarize agent-related variables used in the model, as recorded in Table 3 [39]. The configuration of certain variables is based on Fast et al. (2015) [8]. At the onset of the COVID-19 outbreak, many countries and regions adopted social distancing policies, including stay-at-home orders and the closure of restaurants. We first examine the use of social distancing policies and assess the agreement between the model and reality. Figure 4 depicts the changes in the health status of individuals during the transmission of the disease without the implementation of any social distancing policy. Initially, the number of exposed individuals increased quickly as the infection spread through the susceptible population, and it peaked on day 28. As the number of exposed individuals increased, so did the number of infected people, which peaked on day 35. Following the peak, as indicated by the figure, the numbers of exposed and infected individuals decreased slowly over time, presenting a smoother curve. Over the whole period of transmission, the cumulative number of infected people kept increasing, and the disease eventually reached almost everyone. In the second simulation, we added a policy of social distancing on day 22, when the number of confirmed cases was increasing rapidly, to reduce contact between individuals. The implementation of the social distancing policy was observed to reduce the total number of patients and preserve public health.

At this point, the government's social distancing policy exerts a controlling effect on the epidemic, including slowing the rate of transmission of the disease, reducing the number of cases at the peak, and preventing runs on medical resources. Figure 4 shows that the total number of cases decreased significantly after the implementation of the social distancing policy by the government.

Then, we obtained the average search volume for facemasks, medical alcohol, and N95 masks between January 20, 2020, and March 10, 2020, based on the Baidu Index (Baidu is a widely used search engine in China, and the Baidu Index is similar to Google Trends), and fitted the average values to the extent of online panic after the implementation of the social distancing policy. The resulting goodness-of-fit statistic suggested an adequate fit. This indicated that the simulation method has practical significance and can be used as an effective guide for the government information disclosure policy. Given the success of the social distancing policy and its widespread adoption in reality, in the subsequent simulation, we explored the impact of different information disclosure policies based on the social distancing policy already adopted.
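A comparison of this kind between a simulated panic curve and a search-index series can be sketched as follows. The statistic used in the study is not restated here, so the sketch assumes the coefficient of determination of a simple linear fit, which equals the squared Pearson correlation. The function name `r_squared` is hypothetical, and the two short series are synthetic placeholders, not data from the study.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equally long series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    if s_xx == 0 or s_yy == 0:
        raise ValueError("a constant series has no defined correlation")
    return s_xy * s_xy / (s_xx * s_yy)


# Synthetic placeholder series, for illustration only.
simulated_panic = [0.10, 0.25, 0.60, 0.55, 0.30]
search_index = [120.0, 300.0, 710.0, 640.0, 350.0]
print(round(r_squared(simulated_panic, search_index), 3))
```

Because the statistic is invariant to rescaling of either series, the simulated panic level (between 0 and 1) and the raw search volume can be compared without normalizing them first.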

3.1. The Impact of Social Informatization

Based on the social distancing policy during the outbreak of a disease, we now attempt to examine the influence of social informatization on the extent of online panic. Two levels of social informatization were used for comparative analysis.

The evolution of online panic in two communities with similar structure experiencing the same epidemic with different coverage rates of public media is depicted in Figure 5. The community with a lower coverage rate exhibited a lower level of cyber panic, and the corresponding growth rate and peak value of online panic were lower. However, Fast et al. (2018) identified a significant positive correlation between the intensity of media coverage and the decline of epidemics—when media coverage increased tenfold, the epidemic trend decreased by 33.5%, and communities with a high acceptance rate of public media coverage were conducive to preventing the spread of the epidemic [27]. At present, the level of social informatization is relatively high. This could explain the higher psychological harm caused by epidemic diseases in modern times than in the past. For example, COVID-19 induced online panic to a higher degree than SARS.

Transparency and full disclosure of information are necessary to combat epidemics. However, Figure 5 reveals that online panic was very high corresponding to a high level of social informatization. Therefore, the influence of the mode of disclosure of information on online panic should be investigated to determine the most conducive mode of disclosure.

3.2. The Impact of Data Disclosure Patterns

In this section, we explore the impact of two distinct governmental disclosure policies for epidemic information on cyber panic. Figures 5 and 6 depict the changes in the overall average level of panic over time in each scenario over 50 repeated runs of the model. Two statistics can be used to disclose information about an epidemic—the total number of existing cases and the number of daily new cases. As is evident from Figure 6, adoption of the latter statistic effectively reduces the peak of online panic and delays it compared to the other case, giving policymakers more time to respond to the disease. It follows that reporting the number of daily new cases is a more effective policy.

The disclosure of the number of asymptomatic infections is another decision to be made by policymakers. Asymptomatic infected persons do not exhibit any symptoms of infection but can be detected by tests and the detection rate increases over time. The number of asymptomatic infected cases is usually disclosed in aggregate. It is difficult to estimate this figure during the early stages of an outbreak because a high rate of testing is required, which is usually achieved later. Asymptomatic infected persons do not require specialized medical treatment, and most of them heal on their own. Therefore, in our simulation, we selected four key points in time—the first day, the 15th day, the 30th day, and the 45th day—to correspond to the beginning, inflation, peak, and recession phases of the epidemic, respectively.

As indicated by the data presented in Figure 7, disclosing the number of asymptomatic infected patients during the recession phase, once the number of new infections per day has fallen significantly, yields the best outcome for public opinion and the lowest peak of online panic. Generally, the faster the public health sector releases information, the higher its transparency and credibility are, and the more likely it is to quell rumors and foster the steady evolution of public opinion. However, the increase in the number of asymptomatic infected patients cannot be attributed solely to the escalation of the epidemic. It is also significantly influenced by the increase in the detection rate, which, in turn, increases with time. Thus, by disclosing the number of asymptomatic infected people during the recession phase of the epidemic, policymakers can positively impact public opinion, thereby reducing the peak of online panic.

4. Conclusions

In this study, an agent-based simulation was used to consider the effects of both online and offline factors on online panic. In the offline component, an improved epidemic dynamics model based on the SEIR model was proposed to simulate offline epidemic transmission. In the online component, online community networks were analyzed to propose a scale-free Internet community relationship network, which was used to simulate the transmission of online panic. COVID-19 was selected as a case study to simulate the spread of online panic, and the Baidu search index was fitted to the simulated extent of online panic to validate the model. The effectiveness of different information disclosure strategies (such as the type of cases disclosed and the disclosure of asymptomatic infected persons) in response to online panic during disease outbreaks was also assessed. By collecting short posts from Chinese social media for sentiment analysis, the study divided the transmission of the disease into increasing and declining stages and analyzed the variations in the emotions of netizens with respect to the published epidemic data during the two periods. Then, we explored the impact of media penetration on social panic. The study found that high media penetration rates lead to high social panic responses. In the past, when the media penetration rate was low, pandemic-induced online panic rose slowly and soon began to wane. This research is expected to help mitigate the economic impact of epidemics by optimizing information dissemination policies, increasing public trust, and reducing panic.

This study concluded that social distancing policies are effective—in the presence of such policies, simulation results indicated that increased social informatization levels induce more substantial online panic during the outbreak. To reduce the economic impact of epidemics, we suggest that the government should disclose the number of daily new infections rather than the total number of infected cases and withhold the announcement of the number of asymptomatic infections till the peak of symptomatic infections is attained. According to our simulations, this should reduce the intensity of online panic and delay its peak, which would also reduce the adverse impact of social response on the economy.

Finally, this study also contributes to economic development. In 2020, several countries adopted stimulus policies, such as proactive fiscal or loose monetary policies, to help enterprises tide over difficulties and stimulate consumer spending. The conclusions of the study demonstrate clear methods to control and reduce online panic, which, in turn, will help boost public confidence and stimulate consumer spending. This would empower the government’s stimulus policies. Moreover, this model can be extended to the field of businesses to help corporate and commercial organizations make better decisions. For example, this model can be adapted to predict the proportion of people who want to attend sales promotional activities during an epidemic, which can inform the decision to hold such an event.

This study has certain shortcomings. The NLP sentiment analysis used only two categories, which was sufficient to support the agent-based model's hypothesis. However, in the real world, emotions may fall into multiple categories. Therefore, in future work, we intend to use deep learning-based methods, e.g., pretrained word embedding models, for model training and sentiment analysis, and to further refine the model.

Data Availability

Textual data and program code can be obtained from https://github.com/downw/SocialMediaText.

Conflicts of Interest

The authors declare no conflicts of interest.

Acknowledgments

This study was supported by the National Natural Science Foundation of China under Grant no. 71904106, the Ministry of Education of Humanities and Social Science Project of China under Grant no. 19YJC870019, the China Postdoctoral Science Foundation under Grant no. 2018M632688, and the Postdoctoral Science Foundation of Shandong Province under Grant no. 201903009.