Abstract

Mining the learning resources that learners are interested in from massive data, and recommending appropriate educational resources according to students' characteristics, is of great significance. To improve the accuracy of educational resource recommendation, this paper proposes an educational resource recommendation system based on a fusion model of a graph attention network and a conditional random field. The system builds all comments for each student and each educational resource into a comment graph. The topological structure of the graph captures the dependencies between words in the comment text, and a graph attention network based on connection relations aggregates the adjacency information of each node. A conditional random field inference layer is added after the graph attention network layer. The label sequence with the highest probability is output by the conditional random field inference layer and is taken as the final recommendation result of the model. Experimental results show that the proposed algorithm performs better in accuracy and diversity than traditional recommendation algorithms.

1. Introduction

With the development of science and technology, the number of online learning resources has grown rapidly. Learners struggle to find suitable learning content in such massive and complicated data. While searching, learners are distracted by a large amount of irrelevant information, which wastes their time and steadily reduces their learning efficiency and interest [1]. Therefore, how to recommend learning resources that users are interested in has become the primary research question in this area [2]. By combining personalized recommendation with learning, a learning resource recommendation system lets learners obtain more targeted learning resources more accurately and quickly [3].

As important auxiliary information in recommendation systems, comment text can describe users' interests in different aspects of an item [4]. Recommendation methods based on comment text mainly include topic modeling methods and deep learning methods (such as convolutional neural networks and recurrent neural networks) [5]. Although these methods have achieved certain performance improvements, they still have limitations. Topic modeling methods can only capture the semantic information of the text at the global level and ignore the critical word order and word context information in the text [6]. Deep learning methods can effectively capture the context of adjacent words, but they are limited in capturing long-term, global, and discontinuous dependencies between words [7]. Moreover, such methods consider only the static preferences of the user side or the item side separately and fail to capture preference characteristics at the interaction level [8].

In the recommendation field, among international studies, literature [9] adopts a collaborative filtering algorithm to recommend appropriate online learning resources of interest to different learners according to their different goals and interests. Literature [10] studied learners' characteristic preferences and activity behaviors, addressed shortcomings in the modeling process, and established a dynamic preference model for further recommendation. Among domestic studies, literature [11] proposed associating a learning resource tree with user access records, establishing a user preference matrix, and combining collaborative filtering with popularity-based recommendation. Literature [12] presented a method that assigns information entropy and updates it as attribute values change, finally obtaining the weight of user preference; its experimental results show that the recommended results are more accurate. Literature [13] mines and analyzes data from multiple perspectives by constructing a multiassociated data warehouse from the data in user system logs. Literature [14] proposed the long short-term memory (LSTM) network by enhancing the traditional recurrent neural network. Literature [15] established a recommendation model of online learning resources and provided personalized learning resource sequences for learners by using binary particle swarm optimization to find the learning resources that differ least from the learners' characteristics. Literature [16] constructed a learning resource recommendation system from the perspective of semantic Web ontology and mapped learner features and learning resource features into the ontology for matching recommendations. Literature [17] proposed using ontology integration to learn resource relationship features and then carrying out recommendation calculations through a genetic algorithm. Literature [18] proposed a new self-evolving e-learning intelligent recommendation system, which adapts both to learners and to an open network environment. Literature [19] introduces semantic Web technology into personalized recommendation services in the network learning system and proposes an intelligent recommendation system based on semantic discovery and learning preference.

In view of the above problems, this paper proposes an educational resource recommendation system based on a fusion model of a graph attention network and a conditional random field. The recommendation algorithm RGP constructs a review graph from the full set of reviews of each user or item. The graph topology captures long-term, global, and discontinuous dependencies between words in the comment text. The adjacency information of each node is aggregated using a graph attention network based on connection relations, which takes word order into account. A conditional random field (CRF) inference layer is added after the BiLSTM network layer. The CRF constrains the output of the preceding model by considering the relationship between adjacent labels to ensure that the predicted labels are reasonable. Finally, the embedded representations of the user and item IDs are coupled with their comment graph representations as input, and a factorization machine (FM) is used to predict the score.

2. The Proposed Educational Resource Recommendation System

2.1. Recommendation Model for Learning Representation Based on Comment Text Graph

In this paper, $S_p$ represents the comment set generated by user p, and $S_x$ represents the set of comments received by educational resource x, where $n_p$ and $n_x$ represent the total number of comments contained in $S_p$ and $S_x$, respectively. The score of user p on educational resource x is defined as $r_{p,x}$, and the set of all score data is represented as D.

The network structure of RGP, the recommendation algorithm based on comment text graph representation learning, is shown in Figure 1. It contains three modules: the user module (the left two columns of Figure 1), the educational resource module (the right two columns of Figure 1), and the prediction module based on FM [20]. The user module and the educational resource module share the same network architecture; the former learns the representation of users, and the latter learns the representation of educational resources. The prediction module takes the user and educational resource representations as input and calculates the user's rating of the educational resource.

The user (educational resource) module consists of three parts. The comment text graph construction part builds the comment text set of each user (educational resource) into a graph. The graph representation learning part uses the connection-based graph attention network and the interaction-based attention mechanism to extract the representation of the whole graph. The representation fusion part couples the embedded representation of the user (educational resource) ID with its graph representation to obtain the final representation of the user (educational resource). Because the network architecture of the user module is the same as that of the educational resource module, only the parts of the user module are described in detail below.

2.2. Comment Text Graph Construction

This paper uses the method of reference [20] to construct a comment text graph. In the comment text, if two words appear together in a window of size ω (i.e., the distance between the two words is less than ω), then the two words are connected. On this basis, this paper also preserves word order information. For user p's comment text set $S_p$, the keywords of each comment in $S_p$ are first selected using text preprocessing techniques such as sentence segmentation and stop-word removal (for example, prepositions). Then, all comments are constructed into a directed graph, in which the nodes are the keywords of the comment text and the edges describe the co-occurrence relationship between words in a sliding window of fixed size ω.

The word order of the comment text is important in reflecting its semantic meaning. For example, “not very good” and “very bad” convey different levels of negative emotion. To preserve the word order information in the graph, this paper defines three types of connection relation: forward relation $r_f$, backward relation $r_b$, and self-connection relation $r_s$. Take a comment s from the comment set $S_p$. If the selected keyword (node) $v_i$ appears before the keyword $v_j$ in s and the distance between $v_i$ and $v_j$ is less than ω in the original comment, an edge from $v_i$ to $v_j$ is established in the graph, and the connection of this edge is $r_f$. At the same time, an edge from $v_j$ to $v_i$ is established, and the connection of this edge is $r_b$. If $v_i$ appears both before and after $v_j$ (which is rare), the edge connection is randomly set to $r_f$ or $r_b$. In addition, to take the information of the word itself into account, each node in the graph adds an edge connected to itself (for example, from $v_i$ to $v_i$) and defines its connection as $r_s$.

Figure 2 shows an example of structuring the comment “Friends love this nice durable mouse.” The words “friends,” “love,” “nice,” “durable,” and “mouse” are all selected as keywords in the comment. The word “this” is removed, and the window size is set to 3. Since the distance between the keyword “nice” and each of the three keywords “love,” “durable,” and “mouse” in the comment is less than 3, three bidirectional edges are established, and the edge types are defined according to word order. The rest of the comment text graph is built in the same way.
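The construction rule above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `build_comment_graph`, the tiny `STOP_WORDS` set, and the regular-expression tokenizer are all assumptions made for the example. Where the text sets the relation at random when a word appears both before and after another, the sketch simply keeps the first relation seen.

```python
import re
from typing import Dict, List, Set, Tuple

# Hypothetical stop-word list; the paper does not specify one.
STOP_WORDS: Set[str] = {"this", "the", "a", "an", "is", "of"}

FORWARD, BACKWARD, SELF = "forward", "backward", "self"


def build_comment_graph(comments: List[str], window: int = 3) -> Dict[Tuple[str, str], str]:
    """Map each directed edge (source, target) to its connection relation."""
    edges: Dict[Tuple[str, str], str] = {}
    for comment in comments:
        tokens = re.findall(r"[a-z]+", comment.lower())
        # Keep original positions so distances are measured in the raw comment.
        keywords = [(pos, tok) for pos, tok in enumerate(tokens) if tok not in STOP_WORDS]
        for _, word in keywords:
            edges[(word, word)] = SELF  # self-connection for every node
        for idx, (pos_i, w_i) in enumerate(keywords):
            for pos_j, w_j in keywords[idx + 1:]:
                if w_i == w_j or pos_j - pos_i >= window:
                    continue
                # Simplification: the first relation seen for a pair is kept,
                # whereas the text picks one at random in the rare conflict case.
                edges.setdefault((w_i, w_j), FORWARD)
                edges.setdefault((w_j, w_i), BACKWARD)
    return edges
```

Calling `build_comment_graph(["Friends love this nice durable mouse."], window=3)` links "nice" to "love," "durable," and "mouse" in both directions, but not to "friends," whose distance to "nice" in the original comment is exactly 3.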

For user p, $G_p = (V_p, E_p)$ is used to represent the corresponding comment text graph. $V_p$ is the set of nodes (i.e., keywords), $E_p$ is the set of node-edge-node triplets $(v_i, r, v_j)$, and r is the connection relation between nodes $v_i$ and $v_j$ (one of the above three relations). The comment text graph of educational resource x, $G_x$, can be built in the same way.

2.3. Graph Representation Learning
2.3.1. Embedded Layer

In the embedding layer, the user ID, educational resource ID, word ID, and connection relation ID are mapped to different embedding spaces to obtain the corresponding low-dimensional embedding features. In this paper, $e_p$, $e_x$, $e_i$, and $e_r$ are, respectively, used to represent the low-dimensional embedding features of user p, educational resource x, word i, and connection relation r, each a vector in $\mathbb{R}^{d_0}$, where $d_0$ is the vector dimension of the embedding space.

2.3.2. Graph Attention Network Based on Connection Relation

The words in the comment text are not independent of each other, and the semantic information of a word can be enriched by the words around it. A graph attention network based on connection relations is therefore proposed to aggregate adjacency information. For a node $v_i$ in the input graph, let $N_i$ denote its set of adjacent nodes, which contains $v_i$ itself. At layer l of the graph attention network, the importance weight of each adjacent node $v_j \in N_i$ is calculated from the layer-l vector representations of $v_i$ and $v_j$ and from the embedding of the connection relation between them. The calculation uses separate transformation weight matrices for the nodes and for the connection relations, the LeakyReLU activation function, and the vector dimension of the representation space. The initial representation of each node is the low-dimensional embedding feature output by the embedding layer, so the vector dimension at the first layer is that of the embedding space, and it may differ at the other layers.

The importance weight describes how important each adjacent node is. According to these weights, the vector representations of the adjacent nodes are aggregated and passed through the Tanh activation function to obtain the output vector representation of $v_i$. The long-term dependence between words in the comment text is captured by stacking multiple graph attention layers. If the number of stacked layers is l, each node has l corresponding output vectors, one per layer.
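To make the aggregation step concrete, the following is a minimal pure-Python sketch of one relation-aware attention layer. The exact scoring formula is not reproduced here, so the form used below (a scaled dot product between the transformed centre node and the transformed neighbour plus relation embedding, passed through LeakyReLU and normalised with softmax) is an assumption, as are the names `relation_gat_layer`, `w_node`, and `w_rel`.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def matvec(matrix: Matrix, vector: Vector) -> Vector:
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def leaky_relu(value: float, slope: float = 0.2) -> float:
    return value if value > 0 else slope * value


def softmax(scores: List[float]) -> List[float]:
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def relation_gat_layer(
    h: Dict[str, Vector],
    rel_emb: Dict[str, Vector],
    neighbors: Dict[str, List[Tuple[str, str]]],
    w_node: Matrix,
    w_rel: Matrix,
) -> Dict[str, Vector]:
    """One layer; neighbors[v] lists (adjacent node, relation), including v itself."""
    out: Dict[str, Vector] = {}
    for node, adjacent in neighbors.items():
        query = matvec(w_node, h[node])
        dim = len(query)
        messages, scores = [], []
        for other, relation in adjacent:
            node_part = matvec(w_node, h[other])
            rel_part = matvec(w_rel, rel_emb[relation])
            message = [a + b for a, b in zip(node_part, rel_part)]
            dot = sum(q * m for q, m in zip(query, message))
            messages.append(message)
            # Assumed scoring function: scaled dot product + LeakyReLU.
            scores.append(leaky_relu(dot / math.sqrt(dim)))
        weights = softmax(scores)
        # Weighted sum of neighbour messages followed by Tanh.
        out[node] = [
            math.tanh(sum(w * msg[k] for w, msg in zip(weights, messages)))
            for k in range(dim)
        ]
    return out
```

Stacking several layers amounts to feeding the returned dictionary back in as `h` with a fresh pair of weight matrices per layer, while keeping each layer's output for the later fusion step.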

2.3.3. Attention Mechanism Based on Interactive Relationship

After all nodes of the graph have passed through layer l of the graph attention network based on connection relations, a graph attention mechanism based on interaction relations aggregates the node representations into a representation of the whole graph. The attention mechanism assigns an interaction-level importance weight to each node in the graph based on information from user p and educational resource x. The weight of each node is calculated from the node representation output by layer l of the graph attention network, together with the embedding features of user p and educational resource x, using a transformation weight matrix and the vector concatenation operation ||. The output representation of the graph at layer l is then obtained as the weighted sum of the node representations according to these weights.
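The pooling step can be illustrated as follows. This sketch assumes one particular scoring form: the user and educational resource embeddings are concatenated, transformed by a weight matrix into a query vector, and each node is scored by its dot product with that query. The function name `attention_pool` and this scoring form are assumptions for illustration only.

```python
import math
from typing import List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def attention_pool(
    node_reprs: List[Vector],
    user_emb: Vector,
    item_emb: Vector,
    w: Matrix,
) -> Tuple[Vector, List[float]]:
    """Return (graph representation, node weights) for one layer's node outputs."""
    joint = user_emb + item_emb  # list concatenation plays the role of ||
    # Assumed scoring: query = W [e_p || e_x], score_i = <h_i, query>.
    query = [sum(a * b for a, b in zip(row, joint)) for row in w]
    scores = [sum(q * x for q, x in zip(query, node)) for node in node_reprs]
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(node_reprs[0])
    graph_repr = [
        sum(wt * node[k] for wt, node in zip(weights, node_reprs))
        for k in range(dim)
    ]
    return graph_repr, weights
```

Because the weights depend on both the user and the educational resource, the same comment graph can yield different graph representations for different user-resource pairs, which is the interaction-level behaviour the section describes.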

2.3.4. Representation Fusion

To improve the expressive ability of the model, a nonlinear transformation layer with its own transformation weight matrix is applied to the low-dimensional embedding feature of the user ID. The final representation of the user is the concatenation of this transformed embedding and the graph representations output above at all layers.

Following the same process, the final representation of educational resource x can be obtained.

2.4. CRF Inference Layer

A CRF inference layer is added after the BiLSTM network layer so that the model learns the constraint information between tags. The CRF constrains the output of the preceding model by considering the relationship between adjacent labels to ensure that the predicted labels are reasonable. The steps of the CRF algorithm are as follows:

(1) For the input sequence $X = (x_1, \ldots, x_n)$ and a given tag sequence $y = (y_1, \ldots, y_n)$, the score is shown in the following equation:

$$s(X, y) = \sum_{i=0}^{n} G_{y_i, y_{i+1}} + \sum_{i=1}^{n} P_{i, y_i}, \tag{9}$$

where G represents the transition score matrix, $G \in \mathbb{R}^{(z+2) \times (z+2)}$, and $G_{x,y}$ represents the transition score from label x to label y. $y_0$ and $y_{n+1}$ represent the start and end tags in a sentence. The matrix P is the output of the BiLSTM layer, $P \in \mathbb{R}^{n \times z}$, and $P_{i, y_i}$ represents the output score of the i-th word under tag $y_i$. n represents the length of the sequence, and z represents the number of tags.

(2) The softmax function is used to normalize the scores and obtain the probability of the tag sequence y, as shown in the following equation:

$$p(y \mid X) = \frac{e^{s(X, y)}}{\sum_{\tilde{y} \in Y_X} e^{s(X, \tilde{y})}}, \tag{10}$$

where y represents the actual tag sequence and $Y_X$ represents the set of all possible tag sequences. During training, the likelihood of the correct tag sequence is maximized, as shown in the following:

$$\log p(y \mid X) = s(X, y) - \log \sum_{\tilde{y} \in Y_X} e^{s(X, \tilde{y})}. \tag{11}$$

(3) The sequence with the highest predicted total score among all sequences is regarded as the optimal sequence, that is, the final text recognition result of educational resources, as shown in the following equation:

$$y^{*} = \arg\max_{\tilde{y} \in Y_X} s(X, \tilde{y}). \tag{12}$$
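The scoring, normalisation, and decoding steps above follow the usual linear-chain CRF pattern, which can be sketched in plain Python as below. The layout is an assumption made for the example: `emissions` is an n-by-z list of per-word tag scores, and `transitions` is a (z+2)-by-(z+2) matrix whose last two indices stand for the start and end tags. The function names are hypothetical.

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]


def sequence_score(emissions: Matrix, transitions: Matrix, tags: List[int]) -> float:
    """Total score of one tag sequence: transition scores plus emission scores."""
    z = len(emissions[0])
    path = [z] + list(tags) + [z + 1]  # assumed indices: z = start, z + 1 = end
    trans = sum(transitions[a][b] for a, b in zip(path, path[1:]))
    emit = sum(emissions[i][t] for i, t in enumerate(tags))
    return trans + emit


def _logsumexp(values: List[float]) -> float:
    peak = max(values)
    return peak + math.log(sum(math.exp(v - peak) for v in values))


def log_partition(emissions: Matrix, transitions: Matrix) -> float:
    """Log of the softmax denominator over all tag sequences (forward algorithm)."""
    n, z = len(emissions), len(emissions[0])
    alpha = [transitions[z][t] + emissions[0][t] for t in range(z)]
    for i in range(1, n):
        alpha = [
            _logsumexp([alpha[s] + transitions[s][t] for s in range(z)]) + emissions[i][t]
            for t in range(z)
        ]
    return _logsumexp([alpha[t] + transitions[t][z + 1] for t in range(z)])


def viterbi_decode(emissions: Matrix, transitions: Matrix) -> Tuple[float, List[int]]:
    """Highest-scoring tag sequence and its score."""
    n, z = len(emissions), len(emissions[0])
    best = [transitions[z][t] + emissions[0][t] for t in range(z)]
    backpointers: List[List[int]] = []
    for i in range(1, n):
        step_best, step_back = [], []
        for t in range(z):
            prev = max(range(z), key=lambda s: best[s] + transitions[s][t])
            step_best.append(best[prev] + transitions[prev][t] + emissions[i][t])
            step_back.append(prev)
        best = step_best
        backpointers.append(step_back)
    last = max(range(z), key=lambda t: best[t] + transitions[t][z + 1])
    score = best[last] + transitions[last][z + 1]
    path = [last]
    for step_back in reversed(backpointers):
        path.append(step_back[path[-1]])
    path.reverse()
    return score, path
```

The training objective for one labelled sequence is then `sequence_score(...) - log_partition(...)`, and prediction uses `viterbi_decode`, which avoids enumerating every possible tag sequence.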

2.5. Score Prediction

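In this paper, an FM calculates users' ratings of educational resources. First, the final representations of the user, $\tilde{e}_p$, and of the educational resource, $\tilde{e}_x$, are combined as follows:

$$k = \tilde{e}_p \,\|\, \tilde{e}_x. \tag{13}$$

The score is calculated as follows:

$$\hat{r}_{p,x} = b_0 + b_p + b_x + w^{\top} k + \sum_{t=1}^{d_k} \sum_{W=t+1}^{d_k} \langle v_t, v_W \rangle \, k_t k_W, \tag{14}$$

where $b_0$, $b_p$, and $b_x$ are the global bias, user bias, and educational resource bias, respectively. w is a weight vector, and $v_t$ and $v_W$ are the latent factor vectors corresponding to the elements of the t-th and W-th dimensions of k. $k_t$ is the value of the element in the t-th dimension of k, $d_k$ is the dimension of k, and $\langle \cdot, \cdot \rangle$ is the inner product operation.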
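A second-order factorization machine of the kind described here can be sketched as follows. The argument names (`k`, `w`, `v`, and the three bias terms) mirror the quantities named in the text, but the function names and the data layout (`v[t]` is the latent factor vector of dimension t of k) are assumptions for the example. The second function uses the well-known linear-time rewriting of the pairwise term and should give the same value.

```python
from typing import List

Vector = List[float]


def fm_predict(k: Vector, w: Vector, v: List[Vector],
               b_global: float, b_user: float, b_item: float) -> float:
    """Direct form: biases + linear term + all pairwise interactions."""
    linear = sum(w_t * k_t for w_t, k_t in zip(w, k))
    pairwise = 0.0
    for t in range(len(k)):
        for u in range(t + 1, len(k)):
            inner = sum(a * b for a, b in zip(v[t], v[u]))
            pairwise += inner * k[t] * k[u]
    return b_global + b_user + b_item + linear + pairwise


def fm_predict_fast(k: Vector, w: Vector, v: List[Vector],
                    b_global: float, b_user: float, b_item: float) -> float:
    """Same value via 0.5 * sum_f [(sum_t v_tf k_t)^2 - sum_t (v_tf k_t)^2]."""
    linear = sum(w_t * k_t for w_t, k_t in zip(w, k))
    pairwise = 0.0
    for f in range(len(v[0])):
        total = sum(v[t][f] * k[t] for t in range(len(k)))
        squares = sum((v[t][f] * k[t]) ** 2 for t in range(len(k)))
        pairwise += 0.5 * (total * total - squares)
    return b_global + b_user + b_item + linear + pairwise
```

The fast form matters in practice because k is the concatenation of two fairly long representations, so the number of explicit pairs grows quadratically with its dimension.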

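To learn the parameters θ of the whole model, this paper defines the loss function of the model as follows:

$$L = \sum_{(p,x) \in D} \left( \hat{r}_{p,x} - r_{p,x} \right)^2 + \lambda \lVert \theta \rVert_2^2, \tag{15}$$

where $\hat{r}_{p,x}$ and $r_{p,x}$ are the predicted and observed scores of user p on educational resource x, and λ is the regularization coefficient. The whole model can be trained efficiently end to end by the backpropagation algorithm.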

2.6. Algorithm Flow

The RGP algorithm flow is as follows (Algorithm 1).

(i) Input: the rating data D of users for educational resources, and the comment text sets of users and educational resources
(ii) Output: the RGP prediction model
(1) Randomly initialize all parameters θ of the model
(2) for each user p and educational resource x in score data D do
(3) Construct the comment text graphs of user p and educational resource x
(4) Compute the predicted score based on equations (1) to (8) and equations (13) to (14)
(5) According to equation (15), use the backpropagation algorithm to calculate the gradient of all parameters θ
(6) Update the parameters with the gradient descent method (Adam)
(7) end for
(8) return the RGP prediction model

2.7. System Module Design

A college education information recommendation system based on the fuzzy algorithm of multiple mixed criteria is an education information recommendation system with a retrieval engine. The system consists of a retrieval module, a database module, and a recommendation display module. The web retrieval engine is set in the retrieval module, which is used for educational information retrieval and efficient transmission of educational data. The block diagram is shown in Figure 3.
(1) Retrieval module: After logging into the system and entering the retrieval module, the user selects a language he or she knows. The retrieval module supports Mongolian, Chinese, and English. In any of these three languages, users can enter the type of educational resource, keywords, and subject information in the search interface.
(2) Database module: This module provides management of educational information as well as of system and user information. Users are classified as registered users, common users, and management users. Registered users can retrieve educational information for browsing. Common users can search for and download educational information. Management users have the right to manage the overall functions of the database.
(3) Recommendation display module: This module presents the educational information with the highest recommendation degree to users. It has three main panels: the list of learning resources, the list of recommended resources, and the neighbor list. Clicking the name of a list in the panel activates the corresponding function. The learning resource list displays the downloaded learning resources. The list of recommended resources shows the educational information with the highest recommendation degree, which the database module extracts according to the user's retrieval information. The neighbor list shows all educational information similar to that retrieved by the user.

3. Experiment

3.1. The Data Set

To verify the effectiveness of the proposed algorithm, the public datasets CiteULike-c and CiteULike-h were used in the experiment. Table 1 shows the statistics of the nodes and of the relations among users, educational resources, and labels in the two datasets. The sparsity of the interaction data in the two datasets is 0.23% and 0.08%, respectively.

3.2. Experimental Methods

The experiment implements the recommendation model based on comment text graph representation learning. After the dataset is divided, the last interaction of each user is reserved as the positive sample for testing, and the remaining interactions are used as positive samples for training. For each user, 1000 comments with no previous interaction were randomly selected as negative test samples. The feature dimension is 64, and the word vectors come from a pretrained model with an output dimension of 768. The autoencoder adopts a three-layer neural network structure, and the dimension of the middle layer is 64. The proportion of features randomly set to zero is 0.3.
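The leave-one-out protocol described above can be sketched as below. The function name `leave_one_out_split`, the dictionary layout, and the fixed random seed are assumptions for the example; the sketch also treats every candidate as an item identifier and assumes each user's interactions are already ordered in time.

```python
import random
from typing import Dict, List, Sequence, Tuple


def leave_one_out_split(
    interactions: Dict[str, List[str]],
    all_items: Sequence[str],
    num_negatives: int = 1000,
    seed: int = 0,
) -> Tuple[Dict[str, List[str]], Dict[str, Tuple[str, List[str]]]]:
    """Hold out each user's last interaction and sample unseen items as negatives."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train: Dict[str, List[str]] = {}
    test: Dict[str, Tuple[str, List[str]]] = {}
    for user, items in interactions.items():
        if not items:
            continue  # nothing to hold out for this user
        *history, last = items
        seen = set(items)
        # Candidates are items the user never interacted with.
        candidates = sorted(set(all_items) - seen)
        negatives = rng.sample(candidates, min(num_negatives, len(candidates)))
        train[user] = history
        test[user] = (last, negatives)
    return train, test
```

At evaluation time the model ranks the held-out positive together with that user's sampled negatives, and the ranking metrics are computed on this candidate list.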

3.3. Evaluation Indicators

This paper evaluates the performance of the model in terms of accuracy and diversity. For accuracy, HR@K and NDCG@K are chosen, where K is the length of the recommendation list. HR@K is the hit rate, which measures the proportion of users whose test positive sample appears in the recommendation list. NDCG@K is the normalized discounted cumulative gain, which measures the ranking quality of the recommendation list: the higher the user's test positive sample is ranked, the greater its value. Larger values of both metrics indicate higher accuracy.
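With a single held-out positive per user, both accuracy metrics reduce to short functions. The sketch below assumes that setting (one positive item per ranked list); the function names are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def hit_at_k(ranked: Sequence[str], positive: str, k: int) -> float:
    """1.0 if the held-out positive appears in the top-K list, otherwise 0.0."""
    return 1.0 if positive in ranked[:k] else 0.0


def ndcg_at_k(ranked: Sequence[str], positive: str, k: int) -> float:
    """With one relevant item the ideal DCG is 1, so NDCG = 1 / log2(rank + 1)."""
    top = list(ranked[:k])
    if positive not in top:
        return 0.0
    rank = top.index(positive) + 1  # 1-based position in the list
    return 1.0 / math.log2(rank + 1)


def evaluate(rankings: Dict[str, Tuple[List[str], str]], k: int) -> Tuple[float, float]:
    """Average HR@K and NDCG@K over users; rankings[user] = (ranked list, positive)."""
    hits = [hit_at_k(ranked, pos, k) for ranked, pos in rankings.values()]
    gains = [ndcg_at_k(ranked, pos, k) for ranked, pos in rankings.values()]
    return sum(hits) / len(hits), sum(gains) / len(gains)
```

HR@K only records whether the positive made it into the list, whereas NDCG@K also rewards placing it nearer the top, which is why the two can move differently between models.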

For diversity, ILS@K and HD@K are chosen, where K is the length of the recommendation list. ILS is the intra-list similarity, which measures the similarity among the items in a single user's recommendation list. The larger the ILS value, the higher the similarity and the lower the diversity of a single user's recommendation list. This paper uses the co-occurrence vectors of educational resources and tags to calculate the similarity in ILS. HD is the Hamming distance. In information theory, the Hamming distance between two equal-length strings is the number of positions at which the corresponding characters differ. Here, it measures the similarity of recommendation lists between different users. The larger the HD value, the lower the similarity between different users and the higher the diversity.
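The two diversity metrics can be sketched as follows. Two choices here are assumptions rather than details given in the text: ILS is taken as the mean pairwise cosine similarity of the items' tag co-occurrence vectors, and the Hamming distance is computed by encoding each recommendation list as a binary membership vector over the item catalogue and normalising by the combined list length, so that identical lists score 0 and disjoint lists score 1.

```python
import math
from typing import Dict, List, Sequence

Vector = List[float]


def cosine(a: Vector, b: Vector) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def ils_at_k(rec_list: Sequence[str], item_vectors: Dict[str, Vector], k: int) -> float:
    """Mean pairwise similarity inside one user's top-K list (higher = less diverse)."""
    items = list(rec_list[:k])
    pairs = [(a, b) for idx, a in enumerate(items) for b in items[idx + 1:]]
    if not pairs:
        return 0.0
    return sum(cosine(item_vectors[a], item_vectors[b]) for a, b in pairs) / len(pairs)


def hamming_distance_at_k(list_a: Sequence[str], list_b: Sequence[str], k: int) -> float:
    """Normalised Hamming distance between two users' top-K lists."""
    set_a, set_b = set(list_a[:k]), set(list_b[:k])
    if not set_a and not set_b:
        return 0.0
    # Positions where the binary membership vectors differ = symmetric difference.
    return len(set_a ^ set_b) / (len(set_a) + len(set_b))
```

For two lists of the same length K this normalised distance equals one minus the fraction of shared items, so HD@K for a whole system can be taken as its average over all pairs of users.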

3.4. Experimental Results and Analysis

The experiment compares the evaluation metrics of the proposed algorithm with those of traditional recommendation algorithms. To keep the comparison consistent, K in the Top-K recommendation list is set to 10. The experimental results on the two datasets are shown in Tables 2 and 3, respectively. The following can be seen from the experimental results:
(i) The performance of the proposed algorithm is superior to that of the other models on both datasets. Compared with [25], on the CiteULike-c dataset the algorithm in this paper improves by 4.62% on NDCG and 0.29% on ILS, and on the CiteULike-h dataset it improves by 3.53% on NDCG and 1.17% on ILS. This is due to multisemantic feature extraction, 3D convolution high-order feature mining, and the diversity loss function.
(ii) Compared with literature [21], literature [22] and literature [23] increased by 1.61% and 0.98% in HR and by 6.64% and 6.06% in NDCG, indicating that text and heterogeneous information networks can effectively alleviate the problem of data sparsity and improve the accuracy of recommendation. It also shows that the semantic features of heterogeneous information networks can effectively enhance the diversity of recommendations.
(iii) The accuracy metrics of the deep learning models are 10%–14% higher than those of the method based on matrix decomposition, indicating the advantage of neural networks in fitting high-order interaction relations. Among the deep learning models, the HR values of literature [24] and literature [25] were compared, and the latter improved by 2.21%, indicating that the cross-product fusion method can obtain more accurate recommendation results.

3.5. System Effect Test

The system was applied in a school library for 70 days. The experimental users were 1500 teachers and students, who used the system to search for books. The experiment shows that the system in this paper can recommend related books according to user search terms, and the recommendation degree of the recommended books is 99%. This indicates that the system in this paper can recommend books according to user preference and the optimal recommendation degree. To further test the application effect of the system, user discovery accuracy and recommendation recall rate are taken as test indexes, and a performance comparison experiment is conducted using the system in this paper, the recommendation system in literature [26], and the recommendation system in literature [27].
(1) User discovery accuracy: the ratio of the number of educational information items recommended by the system and selected by users to the total number of recommendations. Five hundred teachers and students were randomly selected as users from the 1500 teachers and students in the school, and the user discovery accuracy of the three systems was tested as the number of users increased. The results are shown in Table 4. According to the data in Table 4, as the number of users increases, the user discovery accuracy of all three systems decreases. However, the user discovery accuracy of the system in this paper decreases only slightly and slowly. When the number of users increased from 400 to 500, the user discovery accuracy of the system in this paper remained stable at 0.94, while that of the other two systems was less than 0.88.
(2) Recommended recall rate: the ratio of the amount of recommended educational information selected by users to the total amount of educational information used by users.
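The two test indexes defined above are simple set ratios. The following sketch assumes that each user's recommended items and actually used items are available as lists of identifiers; the function names are hypothetical.

```python
from typing import Iterable


def discovery_accuracy(recommended: Iterable[str], used: Iterable[str]) -> float:
    """Share of the recommended items that the user actually selected."""
    recommended_set = set(recommended)
    if not recommended_set:
        return 0.0
    return len(recommended_set & set(used)) / len(recommended_set)


def recommendation_recall(recommended: Iterable[str], used: Iterable[str]) -> float:
    """Share of the items the user actually used that had been recommended."""
    used_set = set(used)
    if not used_set:
        return 0.0
    return len(set(recommended) & used_set) / len(used_set)
```

Both ratios share the same numerator, the recommended items that users selected, and differ only in whether it is divided by the number of recommendations or by the number of items users ended up using.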

Based on the above experimental settings, the recommended recall rates of the three algorithmic recommendation systems were tested, and the comparison results are shown in Figure 4.

Comparing the trends of the recommendation recall rate of the three recommendation systems in Figure 4 shows that the peak recommendation recall rate of the proposed system is greater than 0.92, a significant advantage. As the number of users increases, it always stays above the other two recommendation systems with minimal fluctuation. The lowest recommendation recall rate of the other two recommendation systems falls below 0.81. Most of the educational information recommended by the system in this paper is therefore adopted by users.

4. Conclusion

This paper proposes an educational resource recommendation system based on the review text graph representation learning recommendation algorithm RGP combined with a CRF fusion model. The system effectively combines the performance advantages of comment text and graph representation learning. By introducing the graph attention network based on connection relations and the attention mechanism based on interaction relations, the relevant information among words, interaction behavior, and comment text can be captured more fully. The label sequence with the maximum probability is output by the CRF inference layer and taken as the final recommendation result of the model. Experimental results on the datasets show that the proposed algorithm effectively improves recommendation accuracy compared with traditional recommendation algorithms. The system effect test shows that the proposed resource recommendation system has significant advantages in recommendation recall rate and achieves high discovery accuracy. The next step will be to introduce more nonscoring auxiliary information to establish a more accurate distribution of user preferences and item features and thus improve recommendation accuracy [27].

Data Availability

The labeled dataset used to support the findings of this study is available from the author upon request.

Conflicts of Interest

The author declares no conflicts of interest.

Authors’ Contributions

Erqi Zeng contributed to the writing of the manuscript and data analysis. Dalei Jiang supervised the work and designed the study. All the authors have read and agreed with the final version to be published.

Acknowledgments

This work was supported by Henan Educational Science Planning Project: Research on the construction of a three-dimensional foreign language teaching model in Colleges and Universities under the background of educational informatization (2019-JKGHYB-0182).