Scientific Programming

Volume 2019, Article ID 2753152, 12 pages

https://doi.org/10.1155/2019/2753152

## Intelligent Behavior Data Analysis for Internet Addiction

Correspondence should be addressed to Xinlei Zhang; xinleizhang1997@gmail.com

Received 26 August 2019; Revised 21 October 2019; Accepted 7 November 2019; Published 29 November 2019

Guest Editor: Aibo Song

Copyright © 2019 Wei Peng et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

Internet addiction refers to excessive internet use that interferes with daily life. Because of its negative impact on college students’ study and life, it is necessary to discover students’ internet addiction tendencies and to guide them appropriately and in time. However, the research methods currently used to analyze students’ internet addiction are mainly questionnaires and statistical analysis, which rely heavily on domain experts. Fortunately, with the development of the smart campus, students’ behavior data, such as consumption and trajectory information on campus, are stored. With this information, we can analyze students’ internet addiction levels quantitatively. In this paper, we provide an approach to estimate college students’ internet addiction levels using their on-campus behavior data. In detail, we treat a student’s addiction towards the internet as a hidden variable that, together with other behavior, affects the student’s daily time online. By predicting students’ daily time online, we can infer their internet addiction levels. Along this line, we develop a linear internet addiction (LIA) model, a neural network internet addiction (NIA) model, and a clustering-based internet addiction (CIA) model to calculate students’ internet addiction levels. These three models take the regularity of students’ behavior and the similarity among students’ behavior into consideration. Finally, extensive experiments are conducted on a real-world dataset. The experimental results show the effectiveness of our method and are consistent with some psychological findings.

#### 1. Introduction

Internet addiction disorder refers to excessive internet use that interferes with daily life [1]. Some research shows that addiction towards the internet has a negative impact on college students’ study, health, and social relationships [1–3]. Therefore, it is necessary to discover students’ addiction tendencies towards the internet and to give them appropriate guidance.

At present, related work on internet addiction is concentrated in the psychological field. Such work focuses on the causes and influence of internet addiction, the internal mechanisms leading to it, and methods to eliminate it. There is little work on calculating internet addiction levels quantitatively. Besides, the methods used for analysis are mainly questionnaires and statistical analysis, which are cumbersome and rely heavily on domain experts. Therefore, it is necessary to develop an approach that explores students’ internet addiction levels quantitatively and automatically.

Fortunately, with the development of the smart campus, students’ behavior data, such as internet access data and consumption data, are collected. With these data, it is possible to analyze students’ internet addiction levels quantitatively.

To this end, in this paper, we propose an approach to estimate students’ internet addiction levels using their behavior data. Currently, there is no method to evaluate students’ addiction level precisely, so we cannot study it directly with supervised methods. Instead, we calculate students’ internet addiction level through another task. In detail, based on the definition of internet addiction, we consider a student’s internet addiction level to be a hidden variable that affects the student’s daily time online. Besides, students’ behavior data, such as consumption data and the internet access gap, reflect their daily activities, which may also influence the time they spend online. We can therefore predict students’ online time from their behavior data and internet addiction level, and through this task, the internet addiction value can be inferred. Along this line, we propose a linear internet addiction (LIA) model, a neural network internet addiction (NIA) model, and a clustering-based internet addiction (CIA) model to capture the relationship between students’ behavior data, internet addiction, and the time they spend online every day.

Furthermore, students have a fixed course schedule every week, which leads to regularity in the time they spend online every week. The LIA and NIA models take the regularity of students’ behavior into consideration, and the CIA model mainly uses the relationship among students’ behavior to learn their internet addiction level. Finally, we conduct extensive experiments on a real-world dataset from a Chinese college, including internet addiction calculation, internet addiction verification, and internet addiction analysis experiments. In particular, to verify that the internet addiction value we calculate is credible, we compare our results with the results evaluated from a psychological scale. The experimental results demonstrate the correctness and effectiveness of the proposed models and are also consistent with some psychological findings.

#### 2. Related Work

The main related work of this paper can be divided into two parts: internet addiction analysis and campus data mining.

##### 2.1. Internet Addiction Analysis

Internet addiction analysis is a research direction in the psychological field. Some works focus on the causes of internet addiction. Researchers found that interpersonal difficulties, psychological factors, social skills, etc., are all reasons for internet addiction [1, 4, 5]. Other works aim at finding the influence of internet addiction. Upadhayay et al. claimed that excessive use of the internet would harm students’ study [2]. He et al. explored internet addiction’s influence on the sensitivity towards punishment and reward [6]. Their result shows that people with serious internet addiction are more sensitive to risk. There are also some works on the inner mechanism by which internet addiction forms. Zhang et al. focused on the inner reason for family function’s negative influence on internet addiction. They revealed that the stability and development of the family might affect users’ mental states, such as dignity and loneliness, and that such mental states in turn influence internet addiction [7]. Zhao et al. noticed that stressful life events make users feel depressed, which leads them to become addicted to the internet [8].

##### 2.2. Campus Data Mining

Data are produced everywhere in our daily activities, for example, consumption records, chat records, and web browsing records. Such data enable interesting applications, such as tag recommendation, which suggests a list of tags when a user wants to annotate an item. Wang et al. proposed the TAPITF model to combine both time awareness and personalization in the tag recommendation task [9]. Campus data mining refers to solving problems on campus with data mining methods. Some works mainly analyze students’ daily behavior. Guan et al. predicted students’ financial hardship from their smart card usage, internet usage, and trajectories on campus (the Dis-HARD model) so that the school can offer those students stipend portfolios [10]. Based on this work, Ye et al. proposed a model [11] that predicts stipend portfolios with multimodal data. Their work achieves higher accuracy than the Dis-HARD model and protects students’ privacy. The Bayesian method is widely used in many fields. Wang et al. proposed a Bayesian probabilistic multitopic matrix factorization model for rating prediction [12]. Similarly, Zhu et al. proposed an unsupervised method under the framework of empirical Bayes to calculate students’ procrastination value from their library borrowing records [13]. Peng et al. proposed a deep topical correlation analysis approach to track students’ thoughts and serve the development of the smart campus using multimodal data [14]. There are also works that aim at analyzing students’ studying process and improving their performance in class, which is called educational data mining (EDM). For example, Burlak et al. identified whether a student is cheating in an exam by analyzing the student’s interaction data with online course systems, such as start time, end time, IP address, and access frequency [15]. Abdi et al. predicted students’ grades based on their answers to regular assignments and the time they spent on each question [16].

In summary, to the best of our knowledge, there is no existing work on analyzing internet addiction using students’ daily behavior, and we are the first to analyze internet addiction based on students’ behavior data with data mining methods.

#### 3. Preliminaries

Internet addiction is an abstract concept in the psychological field, so it is hard to give a measurable definition of internet addiction. To solve this problem, we first make a reasonable assumption about internet addiction. Then, based on this assumption, we calculate the internet addiction value using students’ behavior data.

##### 3.1. Internet Addiction Assumption

Psychological research shows that most college students are addicted to the internet [17]. As mentioned above, internet addiction refers to excessive use of the internet that interferes with daily life. Therefore, students with different internet addiction levels are very likely to spend different amounts of time online. Besides, different behaviors reflect different activities in school, which in turn also lead to different online time. Students of different genders or departments will also differ somewhat in their internet use.

Based on these observations, we assume that internet addiction is a hidden factor that, together with students’ behavior and profile information, may influence their daily time online. Therefore, we learn this factor by modelling how students’ internet addiction and behavior influence daily online time. To simplify the problem, we also assume that a student’s internet addiction level does not change within a semester.

##### 3.2. Problem Formulation

Since we do not have any label about internet addiction level, we cannot use supervised methods to study students’ internet addiction value. Thus, we need to estimate it through some known data. Based on our assumption that the internet addiction value is a hidden variable, which may affect the time students spend online, the value can be learned by predicting students’ daily online time.

Formally, we define $a_u$ as the internet addiction level of student *u*. The daily time online sequence of student *u* during a period *T* is represented as $\{y_u^t\}_{t \in T}$, and the daily behavior sequence of *u* during the same period is represented as $\{\mathbf{b}_u^t\}_{t \in T}$. We also define the personal profile information of student *u* as $\mathbf{p}_u$. Our task is to model the relationship $(\mathbf{b}_u^t, \mathbf{p}_u, a_u) \rightarrow y_u^t$, that is, how students’ behavior and internet addiction influence their daily time online. The internet addiction level $a_u$ can then be calculated from this model. Note that *t* above is in the set *T*.

#### 4. Internet Addiction Calculation Model

To calculate students’ internet addiction level, we propose three internet addiction calculation models: the linear internet addiction (LIA) model, the neural network internet addiction (NIA) model, and the clustering-based internet addiction (CIA) model. For the LIA model, we mainly consider the linear relation between students’ behavior, internet addiction level, and their daily online time. Furthermore, since neural networks are good at capturing higher-order relations among features, we explore the NIA model to capture the nonlinear relation between students’ behavior, internet addiction level, and their daily online time.

As for the CIA model, instead of directly studying the relation between students’ behavior, internet addiction level, and their daily online time, we think that students who spend more time online than the normal online time are more likely to be addicted to the internet. So we devise a clustering-based method to find the normal online time and then regard the difference between students’ actual online time and the normal online time as their internet addiction level.
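The clustering-based idea can be illustrated with a minimal sketch. It assumes that cluster labels have already been produced by some clustering of students’ behavior (the clustering procedure is not specified in this paragraph) and that the median online time within a cluster serves as the "normal" online time. The function name `cia_levels`, the choice of the median, and the toy data are assumptions made for illustration only.

```python
from statistics import median


def cia_levels(online_time, cluster_of):
    """Addiction level = a student's online time minus the cluster's normal time.

    online_time: dict student -> average daily online time (hours)
    cluster_of:  dict student -> cluster label (assumed to be given)
    """
    # Collect the online times of all students in each cluster.
    members = {}
    for student, label in cluster_of.items():
        members.setdefault(label, []).append(online_time[student])

    # Assumption: the median of a cluster is taken as its "normal" online time.
    normal = {label: median(times) for label, times in members.items()}

    # A positive value means the student is online longer than similar peers.
    return {s: online_time[s] - normal[cluster_of[s]] for s in online_time}


# Hypothetical toy data: five students in two behavior clusters.
online_time = {"s1": 2.0, "s2": 3.0, "s3": 7.0, "s4": 5.0, "s5": 6.0}
cluster_of = {"s1": 0, "s2": 0, "s3": 0, "s4": 1, "s5": 1}
levels = cia_levels(online_time, cluster_of)
```

In this toy example, student `s3` spends far more time online than the other members of the same cluster and therefore receives the largest value.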

In this section, we first describe these three models in detail and then discuss the advantages and disadvantages of each model.

##### 4.1. Linear Internet Addiction (LIA) Model

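In this section, we first introduce how we use a linear model to reveal the relationship $(\mathbf{b}_u^t, \mathbf{p}_u, a_u) \rightarrow y_u^t$ between a student’s behavior, personal profile, internet addiction level, and daily online time. Then, to strengthen the model, we take the regularity of students’ behavior into consideration.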

###### 4.1.1. Naive LIA

Based on the internet addiction assumption, behavior is a factor that influences students’ online time. However, different kinds of behavior may have different effects, so a weight vector is needed to represent the effect of each kind of behavior. We take the impact of behavior on online time to be the same across individuals, so every student shares this weight vector. We deal with the different kinds of personal attributes in the same way. Besides, even if two students have the same behavior and personal attributes, they may still spend different amounts of time online because of the difference in their addiction level towards the internet. We suppose that a different internet addiction level is the only reason for different time online given the same behavior and personal attributes. This gives our naive linear internet addiction model:

$$y_u^t = \mathbf{w}^{\top} \mathbf{z}_u^t + a_u, \tag{1}$$

where $y_u^t$ represents the duration student *u* spends online at time *t*, $\mathbf{z}_u^t$ refers to the combination of the behavior vector and the personal attributes of student *u* at time *t*, $\mathbf{w}$ is the weight vector of that combined vector, and $a_u$ is the internet addiction level of student *u*. Our task is to find the values of $\mathbf{w}$ and $a_u$ that minimize the loss function, that is,

$$\min_{\mathbf{w},\,\{a_u\}} \; \sum_{u} \sum_{t \in T} \left( y_u^t - \mathbf{w}^{\top} \mathbf{z}_u^t - a_u \right)^2 + \lambda_1 \lVert \mathbf{w} \rVert_2^2 + \lambda_2 \sum_{u} a_u^2. \tag{2}$$

The item $\lambda_1 \lVert \mathbf{w} \rVert_2^2 + \lambda_2 \sum_{u} a_u^2$ is used to prevent the model from overfitting. $\lambda_1$ and $\lambda_2$ can be used to adjust the weight between the behavior and internet addiction.
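As a rough illustration of the naive LIA model, the sketch below fits a shared weight vector and one intercept per student by plain gradient descent on a regularized squared error. The function name `fit_naive_lia`, the use of a mean (rather than summed) squared error, the hyperparameter values, and the toy data are assumptions for illustration and are not taken from the paper.

```python
def fit_naive_lia(samples, n_features, lam_w=0.01, lam_a=0.01, lr=0.01, epochs=2000):
    """Fit online_time ~ w . z + a[student] by gradient descent.

    samples: list of (student_id, feature_list, online_time) tuples
    Returns the shared weights w and the per-student levels a.
    """
    w = [0.0] * n_features
    a = {student: 0.0 for student, _, _ in samples}
    n = len(samples)

    for _ in range(epochs):
        # Start each gradient with the L2 regularization part.
        grad_w = [2 * lam_w * wj for wj in w]
        grad_a = {s: 2 * lam_a * a_s for s, a_s in a.items()}

        # Add the gradient of the mean squared prediction error.
        for student, z, y in samples:
            err = sum(wj * zj for wj, zj in zip(w, z)) + a[student] - y
            for j, zj in enumerate(z):
                grad_w[j] += 2 * err * zj / n
            grad_a[student] += 2 * err / n

        # Simultaneous update of shared and per-student parameters.
        w = [wj - lr * g for wj, g in zip(w, grad_w)]
        a = {s: a_s - lr * grad_a[s] for s, a_s in a.items()}
    return w, a


# Hypothetical toy data: identical behavior, but student B is online 3 hours longer.
samples = [("A", [z], 2 * z + 1) for z in (1.0, 2.0, 3.0)]
samples += [("B", [z], 2 * z + 4) for z in (1.0, 2.0, 3.0)]
w, a = fit_naive_lia(samples, n_features=1)
```

Because the two toy students share the same behavior features, the gap in their online time can only be absorbed by the per-student term, so `a["B"]` ends up larger than `a["A"]`.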

###### 4.1.2. LIA with Regular Behavior

College students usually have a fixed curriculum. Therefore, their behavior shows some regularity every week, which also leads to regularity in the time they spend online. Take student *u* as an example: suppose his courses on Monday are rather boring, so he spends a lot of time surfing the internet. His courses on Tuesday, however, are hard, so he must pay attention and may not surf the internet in class. Based on such facts, it is necessary to take the regular online time into consideration.

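So, we modify our linear internet addiction model by adding an item that represents the regular online time of student *u* at time *t*. Because of the characteristics of college study, students show similar online habits every week. Here $d(t) = x$ denotes which day of the week time *t* falls on, and $r_u^{x}$ denotes the regular online time of student *u* on day *x* of the week. This gives our new model:

$$y_u^t = \mathbf{w}^{\top} \mathbf{z}_u^t + a_u + r_u^{d(t)}, \tag{3}$$

where, as in the naive model, $y_u^t$ is the online time of student *u* at time *t*, $\mathbf{z}_u^t$ is the combined behavior and personal attribute vector, $\mathbf{w}$ is its shared weight vector, and $a_u$ is the internet addiction level.

For the convenience of calculation, we define $\mathbf{v}^t$ as an 8-dimensional vector whose first item is one, standing for the internet addiction, and whose other items are a one-hot representation of the day of the week. The formula above is equal to

$$y_u^t = \mathbf{w}^{\top} \mathbf{z}_u^t + \boldsymbol{\theta}_u^{\top} \mathbf{v}^t, \tag{4}$$

with $\mathbf{v}^t$ being equal to

$$\mathbf{v}^t = \left( 1,\ \mathbb{1}[d(t) = 1],\ \ldots,\ \mathbb{1}[d(t) = 7] \right)^{\top} \tag{5}$$

and $\boldsymbol{\theta}_u = (a_u, r_u^{1}, \ldots, r_u^{7})^{\top}$.

Our task is to find a suitable $\mathbf{w}$ and $\boldsymbol{\theta}_u$ that minimize the loss function, where the first item of $\boldsymbol{\theta}_u$ is the internet addiction level of student *u*:

$$\min_{\mathbf{w},\,\{\boldsymbol{\theta}_u\}} \; \sum_{u} \sum_{t \in T} \left( y_u^t - \mathbf{w}^{\top} \mathbf{z}_u^t - \boldsymbol{\theta}_u^{\top} \mathbf{v}^t \right)^2 + \lambda_1 \lVert \mathbf{w} \rVert_2^2 + \lambda_2 \sum_{u} \lVert \boldsymbol{\theta}_u \rVert_2^2. \tag{6}$$

Similarly, we add $\lambda_1 \lVert \mathbf{w} \rVert_2^2 + \lambda_2 \sum_{u} \lVert \boldsymbol{\theta}_u \rVert_2^2$ to prevent the model from overfitting, and we use $\lambda_1$ and $\lambda_2$ to adjust the weights between behavior, personal attributes, internet addiction level, and regular habits.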
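The 8-dimensional vector described above can be sketched as follows: the first entry is a constant one that pairs with the addiction level, and the remaining seven entries one-hot encode the day of the week. The helper names `private_vector` and `predict_online_time`, the Monday-first ordering, and the example parameter values are assumptions for illustration.

```python
from datetime import date


def private_vector(day):
    """Return [1, one-hot(day of week)] as an 8-dimensional list."""
    v = [0.0] * 8
    v[0] = 1.0                  # constant entry paired with the addiction level
    v[1 + day.weekday()] = 1.0  # date.weekday(): Monday is 0, Sunday is 6
    return v


def predict_online_time(w, z, theta_u, day):
    """Shared linear part plus the student-specific part theta_u . v."""
    shared = sum(wj * zj for wj, zj in zip(w, z))
    personal = sum(th * vi for th, vi in zip(theta_u, private_vector(day)))
    return shared + personal


# Hypothetical parameters: addiction level 1.5, Monday habit +0.3, Tuesday habit -0.2.
theta_u = [1.5, 0.3, -0.2, 0.0, 0.0, 0.0, 0.0, 0.0]
monday = date(2019, 11, 4)
hours = predict_online_time([0.5], [2.0], theta_u, monday)
```

Writing the student-specific part as a single dot product is what lets the addiction level and the seven weekly habit terms be learned together as one parameter vector per student.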

##### 4.2. Neural Network Internet Addiction (NIA) Model

The neural network is able to model the high-level relationship among features. It is powerful in a variety of application scenarios [18–20]. For example, in the tag recommendation task, Yuan et al. utilized the multilayer perceptron to model the nonlinearities of interactions among users, items, and tags [21]. In this section, we develop a neural network internet addiction (NIA) model to represent the nonlinear influence of students’ behaviors, personal attributes, internet addiction, and their regular behavior on their daily online time.

###### 4.2.1. Network Structure

The neural network consists of two parts: the public part and the private part. We use the public part to represent that the effect of the behavior and personal attributes on daily online time does not differ among individuals, which means the input of the public part is the combination of the behavior vector $\mathbf{b}_u^t$ of student *u* at time *t* and his or her personal attributes vector $\mathbf{p}_u$. The weight matrix $\mathbf{W}^{\mathrm{pub}}$ and the threshold vector $\boldsymbol{\beta}^{\mathrm{pub}}$ of this part are updated in every iteration.

Because the internet addiction level and regular behavior differ among individuals, we use a private part to depict such characteristics. Every student has his or her own weight matrix $\mathbf{W}_u$ and threshold vector $\boldsymbol{\beta}_u$, and these parameters are updated only when the corresponding student’s data are used as the input. The private input of student *u* at time *t* is the same as vector (5). To ignore the influence of regular behavior, we can also keep only the first item of vector (5).

The target output of the model is the actual online time of student *u* at time *t*: $y_u^t$.

The structure of the network is shown in Figure 1.
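The split into a public and a private part can be sketched as a forward pass in plain Python. Since Figure 1 is not reproduced here, the way the two parts are combined (concatenating two sigmoid hidden layers and applying a linear output), the layer sizes, and all names such as `NIASketch` are assumptions for illustration rather than the architecture used in the paper.

```python
import math
import random


def dense(weights, thresholds, inputs):
    """One sigmoid layer; each unit computes sigmoid(w . x - threshold)."""
    return [
        1.0 / (1.0 + math.exp(-(sum(w * x for w, x in zip(row, inputs)) - th)))
        for row, th in zip(weights, thresholds)
    ]


class NIASketch:
    def __init__(self, n_public_in, n_private_in=8, n_hidden=4, seed=0):
        self._rng = random.Random(seed)
        self.n_private_in = n_private_in
        self.n_hidden = n_hidden
        # Public part: one set of parameters shared by every student.
        self.W_pub = self._matrix(n_hidden, n_public_in)
        self.b_pub = [0.0] * n_hidden
        # Private part: one (weights, thresholds) pair per student, created lazily.
        self.private = {}
        # Assumed linear output over the concatenated hidden activations.
        self.w_out = [self._rng.uniform(-0.1, 0.1) for _ in range(2 * n_hidden)]

    def _matrix(self, rows, cols):
        return [[self._rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

    def private_params(self, student):
        if student not in self.private:
            weights = self._matrix(self.n_hidden, self.n_private_in)
            self.private[student] = (weights, [0.0] * self.n_hidden)
        return self.private[student]

    def forward(self, student, public_in, private_in):
        """Predict online time; only this student's private parameters are used."""
        h_pub = dense(self.W_pub, self.b_pub, public_in)
        W_u, b_u = self.private_params(student)
        h_priv = dense(W_u, b_u, private_in)
        return sum(w * h for w, h in zip(self.w_out, h_pub + h_priv))


model = NIASketch(n_public_in=3)
monday = [1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  # constant one plus one-hot weekday
prediction = model.forward("u1", [0.2, 0.5, 1.0], monday)
```

Because a forward pass for one student touches only the shared parameters and that student’s own private parameters, a gradient step on such a sample would leave every other student’s private parameters unchanged, which matches the update rule described above.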