Special Issue: Intelligent Techniques for Simulation and Modelling
Research Article | Open Access
Cross-Domain Personalized Learning Resources Recommendation Method
This paper presents a new method for cross-domain personalized recommendation of learning resources. First, a cross-domain learning resources recommendation model is given. Then, a method for extracting personalized information from web logs is designed, based on a mixed interest measure introduced in this paper. Finally, a learning resources recommendation algorithm based on transfer learning is presented. The algorithm adds a time function and a weight constraint on misclassified samples to the classic TrAdaBoost algorithm. The time function distinguishes the importance of samples according to their dates, and the weight constraint prevents samples from acquiring excessively large or small weights, so both the accuracy and the efficiency of the algorithm are improved. Experiments on a real-world dataset show that the proposed method effectively improves the quality and efficiency of learning resources recommendation services.
1. Introduction
With the continuous development of network technology and educational informatization, network learning systems are widely applied at every stage of education. How to improve the intelligence of the network learning system and the efficiency of its users has become a common concern for researchers. As the volume of learning resources in such systems grows, it becomes more and more difficult for students to find the resources that interest them. Adding a learning resources recommendation service to the network learning system saves students the time and effort of searching and frees them from wading through enormous amounts of network information. Personalized learning resources recommendation techniques have gradually been developed to meet these needs.
The personalized learning resources recommendation service predicts whether new learning resources meet a student's personalized demands, which are extracted from the student's previous learning information, and recommends the resources likely to interest the student.
To obtain good recommendation results, a certain amount of personalized information must be accumulated. The accumulation time is proportional to the student's learning time in the network learning system, so shortening it is an effective way to improve the efficiency of the recommendation service. Cross-domain learning resources recommendation is an effective way to shorten the process of accumulating personalized information.
In this paper, a cross-domain personalized learning resources recommendation method is presented. First, a cross-domain personalized learning resources recommendation service model is introduced. Next, a personalized information mining method is designed. Then, a cross-domain learning resources recommendation algorithm is presented. Finally, the simulation results and the experiment analysis are given.
2. Service Model
A network learning system contains a large number of learning resources. Generally, a student who opens a new resource does not know in advance whether it contains content of interest and can only check the content or the resource description. If the resource does not contain such content, the check is wasted. To reduce wasted checks, a recommendation service should be added to the system. To improve the efficiency of the service, this paper presents a cross-domain personalized recommendation service model, shown in Figure 1.
As shown in Figure 1, when students use the network learning system, web log data are generated. These data include user name, access date, number of accesses, access time, and requested URL. By mining these data, the personalized information of each student is obtained and stored in the student personalized information database, where it serves as the basis for personalized learning resources recommendation. When a student studies a subject, the service uses this personalized information to extract learning information from the current subject and from previously studied subjects, and trains the recommendation model. Using the recommendation model, the service finds the learning resources in the current subject that are likely to interest the student and recommends them.
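As a rough illustration of the first mining step, the sketch below groups web log records by student and by requested URL. The tab-separated line format, the `LogRecord` type, and the function names are assumptions made for this example; the source does not specify the log layout.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class LogRecord:
    user: str              # user name
    accessed_at: datetime  # access date and time
    duration_s: float      # time spent on the page, in seconds
    url: str               # requested URL


def parse_line(line: str) -> LogRecord:
    """Parse one line of the hypothetical format: user, ISO time, seconds, URL."""
    user, stamp, duration, url = line.rstrip("\n").split("\t")
    return LogRecord(user, datetime.fromisoformat(stamp), float(duration), url)


def group_by_student(lines):
    """Return {user: {url: [LogRecord, ...]}} with each list in time order."""
    grouped = defaultdict(lambda: defaultdict(list))
    for line in lines:
        if line.strip():  # skip blank lines
            rec = parse_line(line)
            grouped[rec.user][rec.url].append(rec)
    for per_url in grouped.values():
        for recs in per_url.values():
            recs.sort(key=lambda r: r.accessed_at)
    return grouped
```

With this grouping, the number of accesses to a resource is the length of its record list, and the per-access browsing times are the `duration_s` values.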
3. Personalized Information Mining Based on Mixed Interest Measure
The learning resources a student is interested in are the simplest and most effective reflection of that student's personalized demands. From the web logs, the service can find the learning resources that the student has browsed, analyze which of them the student is or is not interested in, and compose the student's personalized information from the result.
At present, the browsing interest measure [1] is widely used to estimate whether users are interested in the content they browse. The browsing interest measure, formula (1), is computed from the number of times the content was browsed, the duration of each browsing, and the number of bytes of the browsed content.
In a network learning system, learning resources are not only text files but also images, animations, audio, and video. Formula (1) does not account for the differences between file types. In addition, a learning resource includes not only the resource content but also a resource description, called the brief of the resource. To reflect these characteristics of learning resources, the mixed interest measure is computed as follows.
First, the content browsing interest measure is computed from the number of times the content was browsed, the duration of each content browsing, the number of bytes of the resource content, the coefficient of resource content, and the coefficient of resource type.
Then, the brief browsing interest measure is computed from the number of times the brief was browsed, the duration of each brief browsing, the number of bytes of the resource brief, and the coefficient of resource brief.
Next, the browsing interest measure of the resource is computed from the content browsing interest measure and the brief browsing interest measure.
Finally, the mixed interest measure is computed from the browsing interest measure and the total number of browsed resources.
If the mixed interest measure of a learning resource is greater than 1, the student is considered interested in the resource; otherwise, the student is considered uninterested.
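The computation can be sketched in code under explicit assumptions. The exact formulas are not reproduced here, so the form below is hypothetical: each partial measure is taken as a coefficient times total browsing time divided by size in bytes, the two partial measures are summed, and the mixed measure is the ratio of a resource's measure to the average over all browsed resources. The coefficient values and all names are illustrative only.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical resource-type coefficients; the source gives no values.
TYPE_COEFF = {"text": 1.0, "image": 1.0, "video": 1.0}


@dataclass
class Browsing:
    content_times_s: List[float]  # duration of each content browsing
    content_bytes: int            # size of the resource content
    brief_times_s: List[float]    # duration of each brief browsing
    brief_bytes: int              # size of the resource brief
    file_type: str = "text"


def browsing_interest(b: Browsing, content_coeff=0.8, brief_coeff=0.2) -> float:
    """Assumed form: weighted time-per-byte for content plus brief."""
    content = (content_coeff * TYPE_COEFF[b.file_type]
               * sum(b.content_times_s) / b.content_bytes)
    brief = brief_coeff * sum(b.brief_times_s) / b.brief_bytes
    return content + brief


def mixed_interest(resources: Dict[str, Browsing]) -> Dict[str, float]:
    """Assumed form: each resource's measure relative to the average."""
    if not resources:
        return {}
    raw = {rid: browsing_interest(b) for rid, b in resources.items()}
    mean = sum(raw.values()) / len(raw)
    if mean == 0:
        return {rid: 0.0 for rid in raw}
    return {rid: value / mean for rid, value in raw.items()}


def is_interested(mixed_value: float) -> bool:
    """Threshold of 1, matching the classification rule used in the experiments."""
    return mixed_value > 1.0
```

Under this assumed normalization, a value above 1 means the resource attracted more attention per byte than the student's average resource.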
4. Cross-Domain Learning Resources Recommendation Algorithm
The basic idea of resources recommendation is to analyze new learning resources on the basis of a student's personalized information and to find the resources that will interest the student. At present, most resources recommendation services use traditional machine learning, and most traditional machine learning is based on statistical learning, which assumes that the training data and the test data follow the same distribution.
In cross-domain recommendation this assumption does not hold, so traditional machine learning cannot be applied directly. Transfer learning is an effective way to address this problem.
4.1. Transfer Learning
Current transfer learning approaches include instance-based transfer learning and feature-based transfer learning [2–7]. When the source domain data and the target domain data have different distributions but similar content, instance-based transfer learning is the more effective choice. Instance-based transfer learning evaluates the instances of the source domain by some method and applies the instances that are evaluated favorably to the learning task of the target domain. Researchers have proposed many instance-based transfer learning algorithms [8–12]. Dai et al. [13] extended the traditional Boosting ensemble learning algorithm and proposed the TrAdaBoost algorithm, which has the ability to transfer. The basic idea of TrAdaBoost is to train on a mixture of the source domain data and the target domain data and, after each training iteration, to adjust the weights of the instances that were misclassified. If a misclassified instance belongs to the target domain, it is considered important: its weight is increased so that it has more influence in the next iteration. If a misclassified instance belongs to the source domain, it is considered unimportant: its weight is decreased so that it has less influence in the next iteration.
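The weight adjustment described above can be sketched as a single update step of the classic TrAdaBoost scheme of Dai et al. The function name and argument layout are illustrative, and the clamping of the error rate is a practical safeguard added for this sketch rather than part of the published algorithm.

```python
import math


def tradaboost_update(w_src, w_tgt, miss_src, miss_tgt, n_iters):
    """One TrAdaBoost weight update.

    w_src, w_tgt: current weights of source and target instances.
    miss_src, miss_tgt: booleans, True where the hypothesis was wrong.
    n_iters: total number of boosting iterations.
    """
    n = len(w_src)
    # Fixed factor (<= 1) that shrinks misclassified source instances.
    beta = 1.0 / (1.0 + math.sqrt(2.0 * math.log(n) / n_iters))
    # Weighted error measured on the target domain only.
    eps = sum(w for w, m in zip(w_tgt, miss_tgt) if m) / sum(w_tgt)
    # Safeguard (assumption): keep eps inside (0, 0.5) so beta_t is in (0, 1).
    eps = min(max(eps, 1e-10), 0.499)
    beta_t = eps / (1.0 - eps)
    # Misclassified source instances lose weight.
    new_src = [w * beta if m else w for w, m in zip(w_src, miss_src)]
    # Misclassified target instances gain weight.
    new_tgt = [w / beta_t if m else w for w, m in zip(w_tgt, miss_tgt)]
    return new_src, new_tgt, beta_t
```

Because the same instance can be misclassified repeatedly, its weight can grow or shrink without bound over many iterations, which is the behavior the weight constraint in Section 4.2 is meant to limit.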
4.2. Cross-Domain Resources Recommendation Algorithm Design
In TrAdaBoost, all instances are treated in the same way during the initial weight assignment stage: the weights of the source domain instances and the weights of the target domain instances are assigned uniformly. For learning resources recommendation, the resources that a student browsed recently are more likely to reflect the student's current personalized information, so a time function can be added to the weight assignment stage of the algorithm. In this way, the importance of the learning resources can be distinguished.
In addition, in the TrAdaBoost algorithm the weight of a target domain instance that is repeatedly misclassified increases continuously, while the weight of a source domain instance that is repeatedly misclassified decreases continuously. Once enough instances have weights that are too large or too small, the classification ability of the algorithm declines. To solve this problem, the weights of the misclassified samples are constrained.
The detailed description of the cross-domain personalized resource recommendation algorithm is shown below.
Step 1. Let the training set be the union of the set of resources learned in the current subject and the set of resources learned in other subjects, where each instance carries its category label.
Step 2. Set the number of iterations and initialize the iteration counter.
Step 3. Initialize the weights of the instances using the time function, so that the initial weight of an instance depends on how recently it was browsed.
The time function is given by formula (7).
Step 4. Increment the iteration counter.
Step 5. Normalize the weights of the samples.
Step 6. Call the weak learner and obtain a hypothesis.
Step 7. Calculate the error of the hypothesis and the update coefficients derived from it.
Step 8. Update the weights of the instances, subject to the weight constraint of formula (11).
Step 9. If the iteration counter has not reached the number of iterations, go to Step 4; otherwise, return the final hypothesis.
In this algorithm, the time function of formula (7) has the following properties: it is monotonically decreasing on [0,+∞), and its rate of decrease goes from fast to slow. These properties match the time characteristics of learning resources. The argument of the time function is the time interval between the browsing time of the current instance and the most recent browsing time among the instances of the corresponding set (the current-subject set or the other-subjects set). A rate parameter reflects how fast the importance of an instance declines. When this parameter is zero, the time characteristics of the instances are not considered. When it is positive, the larger the parameter, the faster the importance of an instance declines.
Formula (11) uses an upper threshold and a lower threshold on the weight. The thresholds are defined from a maximum value, a minimum value, and a real-valued parameter that can be adjusted. When this parameter is large enough, the weights of the misclassified instances are effectively unconstrained.
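The two modifications can be illustrated with a minimal sketch. Formulas (7) and (11) are not reproduced here, so both functions below are stand-ins chosen only because they satisfy the stated properties: an exponential decay for the time function (monotonically decreasing, fast then slow, constant when the rate is zero), and clipping to a band around the initial weights for the constraint (no effect when the parameter `k` is large). All names and both functional forms are assumptions.

```python
import math


def initial_weights(intervals, rate):
    """Assumed time function exp(-rate * dt), normalized to sum to 1.

    intervals: time since the most recent browsing in the instance's set.
    rate: decay rate; rate = 0 ignores time and gives uniform weights.
    """
    raw = [math.exp(-rate * dt) for dt in intervals]
    total = sum(raw)
    return [r / total for r in raw]


def constrain(weights, initial, k):
    """Assumed constraint: clip to [min(initial) / k, k * max(initial)].

    k: adjustable real parameter (k >= 1); a very large k leaves the
    weights effectively unconstrained.
    """
    upper = k * max(initial)
    lower = min(initial) / k
    return [min(max(w, lower), upper) for w in weights]
```

For example, with intervals `[0, 1, 10]` and a rate of 0.5, the most recently browsed instance receives the largest initial weight, while a rate of 0 yields three equal weights.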
5. Experimental Evaluation
This paper uses the Literature subject and the History subject to verify the effectiveness of the recommendation method, and takes as experimental data the learning logs of a class of 62 students from a network learning system without a recommendation service. The learning resources that students placed in "favorite" are used as the interested resources, and the learning resources that students placed in "recovery" are used as the uninterested resources. The experimental data sets consist of the feature representations of the resources and their browsing times. Part of the information is shown in Table 1.
First, the effectiveness of personalized information mining based on the mixed interest measure is tested. For each student, the experimental process is as follows.
(1) Extract the records of the student from the web logs.
(2) Extract the records that correspond to the resources in the student's experimental data set.
(3) Using the proposed method, calculate the student's interest measure for each of these learning resources.
(4) Classify the learning resources according to whether the interest measure is greater than 1, compare the results with the actual category labels, and calculate the accuracy of the method.
The experimental results are shown in Figure 2.
Of the 62 groups of experimental data, 8 groups have an accuracy between 84% and 85%, 25 groups between 85% and 86%, 27 groups between 86% and 87%, and 2 groups between 87% and 88%. The average accuracy is 86.48%, which meets the demands of personalized learning resources recommendation.
Then, the data sets obtained from personalized information mining are used to test the cross-domain resources recommendation algorithm. The experimental process is as follows.
(1) For each data set, randomly select 10% of the instances from the History subject and all instances from the Literature subject as the training set, and use the remaining instances of the History subject as the test set.
(2) The weight-constraint parameter takes the values 5 and $10^6$, and the rate parameter of the time function takes the values 0.5 and 0.
(3) Train and test with the number of iterations ranging from 1 to 50, and compute the accuracy, precision, and recall.
(4) Repeat 10 times and calculate the average accuracy, precision, and recall.
(5) Calculate the average accuracy, precision, and recall over all data sets.
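The evaluation protocol above can be sketched as follows. The function names, the fixed random seed, and the choice of label 1 as the "interested" class are assumptions for illustration; the source does not state how precision and recall were averaged across classes.

```python
import random


def split_target(target, fraction=0.1, seed=0):
    """Randomly keep `fraction` of the target-subject instances for training.

    Returns (train_part, test_part); the rest of the training set would be
    all instances of the other subject.
    """
    rng = random.Random(seed)
    shuffled = list(target)
    rng.shuffle(shuffled)
    k = max(1, round(fraction * len(shuffled)))
    return shuffled[:k], shuffled[k:]


def scores(y_true, y_pred, positive=1):
    """Return (accuracy, precision, recall) for the given positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall


def average(runs):
    """Average a list of (accuracy, precision, recall) tuples component-wise."""
    return tuple(sum(col) / len(runs) for col in zip(*runs))
```

Repeating the split with ten different seeds and passing the ten score tuples to `average` mirrors steps (4) and (5).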
The final comparison results of the average of accuracy, precision, and recall are shown in Table 2.
With the transfer learning recommendation method, the accuracy, precision, and recall are all much higher than with the traditional SVM method and higher than with the TrSVM method.
Table 2 shows that the accuracy, precision, and recall of the TrAdaBoost algorithm are 80.98%, 81.63%, and 80.98%. After the time function is added to the weight assignment stage, the accuracy, precision, and recall are 82.11%, 82.15%, and 82.06%, only a slight improvement. This is because the algorithm uses the Boosting ensemble learning framework, so the initial weight assignment has only a small effect on the final recommendation. When only the weight constraint is added, the accuracy, precision, and recall are 85.59%, 86.10%, and 85.59%, better than the TrAdaBoost algorithm. This is because the data set obtained from personalized information mining contains erroneous data. Without the weight constraint, the erroneous data greatly degrade the quality of the recommendation service; with the weight constraint, the quality of the recommendation service is greatly improved. With both modifications, the accuracy, precision, and recall of the algorithm proposed in this paper are 86.54%, 86.63%, and 86.42%, clearly higher than those of the TrAdaBoost algorithm, so the proposed recommendation algorithm is more effective.
The convergence comparison is shown in Figure 3.
Figure 3 shows that the TrAdaBoost algorithm needs about 45 iterations to converge. With either the time function or the weight constraint added, the algorithm needs about 40 iterations to converge, and the algorithm proposed in this paper needs only about 35 iterations. The proposed algorithm is therefore more efficient.
The experimental results show that the recommendation service based on the proposed algorithm in this paper is superior to the recommendation service based on TrAdaBoost algorithm both in quality and efficiency and meets the demands of personalized learning resources recommendation service.
6. Conclusions
In a personalized learning resources recommendation service, the recommendation method requires a certain amount of personalized information to achieve a satisfactory effect. How to obtain this information accurately and conveniently, and how to make full use of it for recommendation, is a pressing problem. We extract students' personalized information from web logs based on the mixed interest measure and put forward a cross-domain learning resources recommendation method. This method takes advantage not only of the students' personalized information in the current subject but also of their personalized information in other subjects they have studied. The experimental results show that the proposed learning resources recommendation method effectively improves the quality and efficiency of learning resources recommendation.
This work was supported by the National Natural Science Foundation of China (Grants nos. 60973136 and 61073164) and the Youth Research Foundation of Liaoning University (Grant no. 2011LDQN14).
[1] S. D. Pierrako and G. Paliouras, Web Usage Mining as a Tool for Personalization: A Survey, Kluwer Academic Publishers, 2003.
[2] S. J. Pan, V. W. Zheng, Q. Yang, and D. H. Hu, “Transfer learning for wifi-based indoor localization,” in Proceedings of the Workshop on Transfer Learning for Complex Task of the 23rd AAAI Conference on Artificial Intelligence, Chicago, Ill, USA, July 2008.
[3] J. Ramon, K. Driessens, and T. Croonenborghs, “Transfer learning in reinforcement learning problems through partial policy recycling,” in Proceedings of the 18th European Conference on Machine Learning, pp. 699–707, Springer, 2007.
[4] M. E. Taylor and P. Stone, “Cross-domain transfer for reinforcement learning,” in Proceedings of the 24th International Conference on Machine Learning (ICML '07), pp. 879–886, ACM, June 2007.
[5] R. Raina, A. Battle, H. Lee, B. Packer, and A. Y. Ng, “Self-taught learning: transfer learning from unlabeled data,” in Proceedings of the 24th International Conference on Machine Learning (ICML '07), pp. 759–766, Corvallis, Ore, USA, June 2007.
[6] H. Daumé III and D. Marcu, “Domain adaptation for statistical classifiers,” Journal of Artificial Intelligence Research, vol. 26, pp. 101–126, 2006.
[7] Z. Wang, Y. Song, and C. Zhang, “Transferred dimensionality reduction,” in Proceedings of the European Conference on Machine Learning and Knowledge Discovery in Databases (ECML/PKDD '08), pp. 550–565, Springer, September 2008.
[8] J. Huang, A. Smola, A. Gretton et al., “Correcting sample selection bias by unlabeled data,” in Proceedings of the 19th Annual Conference on Neural Information Processing Systems, pp. 601–608, MIT Press, 2007.
[9] M. Sugiyama, S. Nakajima, H. Kashima, P. Von Bünau, and M. Kawanabe, “Direct importance estimation with model selection and its application to covariate shift adaptation,” in Proceedings of the 21st Annual Conference on Neural Information Processing Systems (NIPS '07), pp. 1433–1440, MIT Press, December 2007.
[10] S. Bickel, C. Sawade, and T. Scheffer, “Transfer learning by distribution matching for targeted advertising,” in Proceedings of the 21th Annual Conference on Neural Information Processing Systems, pp. 145–152, MIT Press, 2009.
[11] A. Storkey and M. Sugiyama, “Mixture regression for covariate shift,” in Proceedings of the 19th Annual Conference on Neural Information Processing Systems, pp. 1337–1344, MIT Press, 2007.
[12] J. Hong, J. Yin, Y. Huang, Y. Liu, and J. Wang, “TrSVM: a transfer learning algorithm using domain similarity,” Journal of Computer Research and Development, vol. 48, no. 10, pp. 1823–1830, 2011.
[13] W. Dai, Q. Yang, G. Xue et al., “Boosting for transfer learning,” in Proceedings of the 24th International Conference on Machine Learning, pp. 193–200, ACM, 2007.
Copyright © 2013 Long Wang et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.