Exploiting Explicit and Implicit Feedback for Personalized Ranking
Previous research on personalized ranking has focused on either explicit feedback data or implicit feedback data rather than making full use of the information in the dataset. Until now, no personalized ranking algorithm has exploited both explicit and implicit feedback. To overcome this shortcoming, a new personalized ranking algorithm (MERR_SVD++) based on the recent xCLiMF model and the SVD++ algorithm is proposed, which exploits both explicit and implicit feedback simultaneously and optimizes the well-known evaluation metric Expected Reciprocal Rank (ERR). Experimental results on practical datasets show that the proposed algorithm outperforms existing personalized ranking algorithms on different evaluation metrics and that the running time of MERR_SVD++ grows linearly with the number of ratings. Because of its high precision and good scalability, MERR_SVD++ is suitable for processing big data and has wide application prospects in the field of internet information recommendation.
1. Introduction
As e-commerce grows in popularity, an important challenge is helping customers sort through a large variety of offered products to easily find the ones they will enjoy the most. One of the tools that address this challenge is the recommender system, which has attracted a lot of attention recently [1–4]. Recommender systems are a subclass of information filtering systems that seek to predict the “rating” or “preference” that users would give to an item. Preferences for items are learned from users’ past interactions with the system, such as purchase histories or click logs. The purpose of the system is to recommend items that users might like from a large collection. Recommender systems have been applied to many areas on the Internet, such as the e-commerce system Amazon, the DVD rental system Netflix, and Google News. Recommender systems are usually classified into three categories based on how recommendations are made: content-based recommendations, collaborative filtering (CF), and hybrid approaches. Among these approaches, collaborative filtering is the most widely used and the most successful.
Recently, collaborative filtering algorithms have been widely studied in both academic and industrial fields. The data processed by collaborative filtering algorithms are divided into two categories: explicit feedback data (e.g., ratings, votes) and implicit feedback data (e.g., clicks, purchases). Explicit feedback data are more widely used in recommender system research [1, 2, 4–7]. They are often in the form of numeric ratings with which users express their preferences regarding specific items. Implicit feedback data are easier to collect. Research on CF with implicit feedback is also called One-Class Collaborative Filtering (OCCF) [8–15], in which only positive implicit feedback or only positive examples can be observed. The explicit and implicit feedback data can be expressed in matrix form as shown in Figure 1. In the explicit feedback matrix, an element can be any real number, but ratings are often integers in the range 1–5, such as the ratings on Netflix, where a missing element represents a missing example. In the implicit feedback matrix, the positive-only user preference data can be represented as a single-valued matrix.
Collaborative filtering algorithms can also be divided into two categories: collaborative filtering (CF) algorithms based on rating prediction [2, 4, 5, 8, 9, 12, 16, 17] and personalized ranking (PR) algorithms based on ranking prediction [3, 6, 7, 10, 11, 13–15, 18]. In collaborative filtering algorithms based on rating prediction, one predicts the actual rating for an item that a customer has not rated yet and then ranks the items according to the predicted ratings. In personalized ranking algorithms based on ranking prediction, on the other hand, one predicts a preference ordering over the yet unrated items without going through the intermediate step of rating prediction. From the recommendation perspective, the order of the items is more important than their ratings. Therefore, in this paper, we focus on personalized ranking algorithms based on ranking prediction.
Previous research on personalized ranking has focused on either explicit feedback data or implicit feedback data rather than making full use of the information in the dataset. However, in most real-world recommender systems both explicit and implicit user feedback are abundant and could potentially complement each other. It is desirable to unify these two heterogeneous forms of user feedback in order to generate more accurate recommendations. The idea of complementing explicit feedback with implicit feedback was first proposed with SVD++, where the author considered explicit feedback as how the users rated the movies and implicit feedback as which movies were rated by the users. The two forms of feedback were combined via a factorized neighborhood model (called Singular Value Decomposition++, SVD++), an extension of the traditional nearest item-based model in which the item-item similarity matrix was approximated via low-rank factorization. In order to make full use of explicit and implicit feedback, Liu et al. proposed the Co-Rating model, which developed matrix factorization models that could be trained from explicit and implicit feedback simultaneously. Both SVD++ and Co-Rating are based on rating prediction. Until now, no personalized ranking algorithm has exploited both explicit and implicit feedback.
To overcome this shortcoming, this paper proposes a new personalized ranking algorithm (MERR_SVD++), which exploits both explicit and implicit feedback and optimizes Expected Reciprocal Rank (ERR) based on the recent Extended Collaborative Less-Is-More Filtering (xCLiMF) model and the SVD++ algorithm. Experimental results on practical datasets show that the proposed algorithm outperforms existing personalized ranking algorithms on different evaluation metrics and that the running time of MERR_SVD++ grows linearly with the number of ratings. Because of its high precision and good scalability, MERR_SVD++ is suitable for processing big data and has wide application prospects in the field of internet information recommendation.
The rest of this paper is organized as follows: Section 2 introduces previous related work; Section 3 demonstrates the problem formalization and SVD++ algorithm; a new personalized ranking algorithm (MERR_SVD++) is proposed in Section 4; the experimental results and discussion are presented in Section 5, followed by the conclusion and future work in Section 6.
2. Related Work
2.1. Rating Prediction
In conventional CF tasks, the most frequently used evaluation metrics are the Mean Absolute Error (MAE) and the Root Mean Square Error (RMSE). Therefore, rating prediction (as in the Netflix Prize) has been the most popular method for solving the CF problem. Rating prediction methods are regression based: they minimize the error between predicted ratings and true ratings. The simplest algorithm for rating prediction is K-Nearest-Neighbor (KNN), which predicts the missing ratings from the neighborhood of users or items. KNN is a memory-based algorithm, and one needs to compute all the similarities between different users or items. More efficient algorithms are model based: they build a model from the visible ratings and compute all the missing ratings from the model. Widely used model-based rating prediction methods include PLSA, the Restricted Boltzmann Machine (RBM), and a series of matrix factorization techniques [22–25].
2.2. Learning to Rank
Learning to rank (LTR) is a core technology for information retrieval. When a query is input into a search engine, LTR is responsible for ranking all the documents or Web pages according to their relevance to this query or other objectives. Many LTR algorithms have been proposed recently, and they can be classified into three categories: pointwise, listwise, and pairwise [3, 26].
In the pointwise approach, it is assumed that each query-document pair in the training data has a numerical or ordinal score. Then the LTR problem can be approximated by a regression problem: given a single query-document pair, its score is predicted.
As the name suggests, the listwise approach takes the entire set of documents associated with a query in the training data as the input to construct a model and predict their scores.
The pairwise approach does not focus on accurately predicting the degree of relevance of each document; instead, it mainly cares about the relative order of two documents. In this sense, it is closer to the concept of “ranking.”
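The difference between the pointwise and pairwise views can be illustrated with a minimal sketch. The two loss functions below (a mean squared error and a pairwise logistic loss) are common textbook choices picked for illustration, not losses taken from the cited works, and all names and toy values are hypothetical. The listwise case is omitted for brevity.

```python
import math

def pointwise_squared_loss(predicted, target):
    """Pointwise view: regress each score toward its own label independently."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)

def pairwise_logistic_loss(predicted, target):
    """Pairwise view: penalize pairs whose predicted order disagrees with the labels."""
    loss, pairs = 0.0, 0
    for i in range(len(target)):
        for j in range(len(target)):
            if target[i] > target[j]:
                # log(1 + exp(-(s_i - s_j))) is small when item i is scored above item j
                loss += math.log1p(math.exp(-(predicted[i] - predicted[j])))
                pairs += 1
    return loss / pairs if pairs else 0.0

labels = [3, 1, 2]
good_order = [10.0, 8.0, 9.0]  # correct order, badly calibrated scores
bad_order = [1.0, 3.0, 2.0]    # scores on the label scale, wrong order

pointwise_good = pointwise_squared_loss(good_order, labels)
pointwise_bad = pointwise_squared_loss(bad_order, labels)
pairwise_good = pairwise_logistic_loss(good_order, labels)
pairwise_bad = pairwise_logistic_loss(bad_order, labels)
```

In this toy case the pointwise loss prefers the wrongly ordered scores because they are numerically closer to the labels, whereas the pairwise loss prefers the correctly ordered ones, which is the sense in which the pairwise approach is closer to ranking.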
2.3. Personalized Ranking Algorithms for Collaborative Filtering
Personalized ranking algorithms can also be divided into two categories: personalized ranking with implicit feedback (PRIF) [6, 7, 10, 11, 13–15] and personalized ranking with explicit feedback (PREF) [18, 27–29]. The foremost PRIF algorithm is Bayesian Personalized Ranking (BPR), which converts the OCCF problem into a ranking problem. Pan et al. proposed Adaptive Bayesian Personalized Ranking (ABPR), which generalized the BPR algorithm for homogeneous implicit feedback and learned the confidence adaptively. Pan et al. proposed Group Bayesian Personalized Ranking (GBPR), which introduces richer interactions among users: it uses group preference to relax the individual and independence assumptions of BPR. Jahrer and Toscher proposed SVD and AFM, which used a ranking-based objective function constructed from a matrix decomposition model and a stochastic gradient descent optimizer. Takács and Tikk proposed RankALS, a computationally effective approach for the direct minimization of a ranking objective function without sampling. Gai proposed a new PRIF algorithm, Pairwise Probabilistic Matrix Factorization (PPMF), to further improve the performance of previous PRIF algorithms. Recently, Shi et al. proposed Collaborative Less-is-More Filtering (CLiMF), in which the model parameters are learned by directly maximizing the well-known information retrieval metric Mean Reciprocal Rank (MRR). However, CLiMF is not suitable for other evaluation metrics (e.g., MAP, AUC, and NDCG). The methods in [6, 7, 10, 11, 13–15] improve the performance of OCCF by solving the data sparsity and imbalance problems of PRIF to a certain extent. As for PREF, one approach was adapted from PLSA and another employed the KNN technique of CF, both of which utilized the pairwise ranking method. A third utilized the listwise method to build its ranker, which was a variation of Maximum Margin Factorization. Shi et al. proposed the Extended Collaborative Less-Is-More Filtering (xCLiMF) model, which can be seen as a generalization of the CLiMF method. The key idea of the xCLiMF algorithm is that it builds a recommendation model by optimizing Expected Reciprocal Rank, an evaluation metric that generalizes Reciprocal Rank (RR) in order to incorporate users’ explicit feedback. The methods in [18, 27–29] likewise improve the performance of PREF by solving its data sparsity and imbalance problems to a certain extent.
3. Problem Formalization and SVD++
3.1. Problem Formalization
In this paper, we use capital letters to denote a matrix (such as $X$). Given a matrix $X$, $X_{ij}$ represents its element in row $i$ and column $j$, $X_{i\cdot}$ indicates the $i$th row of $X$, $X_{\cdot j}$ symbolizes the $j$th column of $X$, and $X^{T}$ stands for the transpose of $X$.
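Given a matrix $X \in \mathbb{R}^{m \times n}$, where the total number of users is $m$ and the total number of items is $n$, if $X$ is an explicit feedback matrix, then $X_{ij} \in \{1, 2, 3, 4, 5\}$ or $X_{ij}$ is unknown. We want to approximate $X$ with a low-rank matrix $\hat{X} = U^{T}V$, where $U \in \mathbb{R}^{d \times m}$, $V \in \mathbb{R}^{d \times n}$, $U$ and $V$ denote the explicit feature matrices of rank $d$ for users and items, respectively, $d \ll r$, and $r$ denotes the rank of $X$, $r \le \min(m, n)$.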
If is an implicit feedback matrix, then or is unknown. We want to approximate with a low rank matrix , where , , , and denote the implicit feature matrix with ranks of for users and items, respectively.
SVD++ is a collaborative filtering algorithm based on rating prediction and matrix factorization that unifies explicit and implicit feedback. In SVD++, the matrix $V$ denotes the explicit and implicit feature matrix of items simultaneously.
The feature vector of user $i$ can be defined as
$$U_i + |N(i)|^{-1/2} \sum_{k \in N(i)} Y_k, \tag{1}$$
where $U_i$ is used to characterize the user’s explicit feedback, $|N(i)|^{-1/2} \sum_{k \in N(i)} Y_k$ is used to characterize the user’s implicit feedback, $N(i)$ denotes the set of all items for which user $i$ gave implicit feedback, and $Y_k$ denotes the implicit feature vector of item $k$.
So the prediction formula of $\hat{X}_{ij}$ in SVD++ can be defined as
$$\hat{X}_{ij} = V_j^{T} \Bigl( U_i + |N(i)|^{-1/2} \sum_{k \in N(i)} Y_k \Bigr). \tag{2}$$
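As a minimal sketch of the SVD++ prediction just described, the following code builds the user vector as the explicit user factor plus the normalized sum of the implicit factors of the items in the user's feedback set and then takes the inner product with the item factor. It assumes the bias-free form of the model (the bias terms of the full SVD++ model are left out); the function names and toy factors are hypothetical.

```python
import math

def svdpp_user_vector(u_explicit, implicit_item_factors):
    """Explicit user factor plus |N(i)|^(-1/2) times the sum of implicit item factors."""
    if not implicit_item_factors:
        return list(u_explicit)
    norm = 1.0 / math.sqrt(len(implicit_item_factors))
    dims = range(len(u_explicit))
    summed = [sum(y[f] for y in implicit_item_factors) for f in dims]
    return [u_explicit[f] + norm * summed[f] for f in dims]

def svdpp_predict(u_explicit, implicit_item_factors, v_item):
    """Predicted score: inner product of the item factor and the combined user vector."""
    z = svdpp_user_vector(u_explicit, implicit_item_factors)
    return sum(a * b for a, b in zip(v_item, z))

# Toy example with d = 2 latent features and |N(i)| = 4 items with implicit feedback.
u_i = [0.5, -0.2]
y_feedback = [[0.1, 0.3], [0.2, -0.1], [0.0, 0.4], [0.1, 0.2]]
v_j = [1.0, 2.0]
score = svdpp_predict(u_i, y_feedback, v_j)
```

Here the implicit term contributes 0.5 * [0.4, 0.8] = [0.2, 0.4], so the combined user vector is [0.7, 0.2] and the score is approximately 1.1.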
4. Exploiting Explicit and Implicit Feedback for Personalized Ranking
In this section, we will firstly introduce our MERR_SVD++ model, then give the learning algorithm of this model, and finally analyze its computational complexity.
4.1. Exploiting Explicit and Implicit Feedback for Personalized Ranking
In practical applications, the user scans the results list from top to bottom and stops when a result is found that fully satisfies the user’s information need. The usefulness of an item at rank $k$ therefore depends on the usefulness of the items at ranks less than $k$. Reciprocal Rank (RR) is an important evaluation metric in information retrieval, and it strongly emphasizes the relevance of results returned at the top of the list. The ERR measure is a generalized version of RR designed to be used with multiple relevance level data (e.g., ratings). ERR has similar properties to RR in that it strongly emphasizes the relevance of results returned at the top of the list.
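Using the definition of ERR in [18, 30], we can describe ERR for a ranked list of $n$ items of user $i$ as follows:
$$\mathrm{ERR}_i = \sum_{j=1}^{n} \frac{P_{ij}}{R_{ij}}, \tag{3}$$
where $R_{ij}$ denotes the rank position of item $j$ for user $i$ when all items are ranked in descending order of the predicted relevance values, and $P_{ij}$ denotes the probability that user $i$ stops at position $R_{ij}$. $P_{ij}$ is defined as follows:
$$P_{ij} = r_{ij} \prod_{k=1}^{n} \bigl( 1 - r_{ik}\, I(R_{ik} < R_{ij}) \bigr), \tag{4}$$
where $I(x)$ is an indicator function, equal to 1 if the condition $x$ is true and 0 otherwise, and $r_{ij}$ denotes the probability that user $i$ finds item $j$ relevant. Substituting (4) into (3), we obtain the calculation formula of $\mathrm{ERR}_i$:
$$\mathrm{ERR}_i = \sum_{j=1}^{n} \frac{r_{ij}}{R_{ij}} \prod_{k=1}^{n} \bigl( 1 - r_{ik}\, I(R_{ik} < R_{ij}) \bigr). \tag{5}$$
We use a mapping function similar to the one used in prior work on ERR to convert ratings (or levels of relevance in general) to probabilities of relevance, as follows:
$$r_{ij} = \begin{cases} \dfrac{2^{X_{ij}} - 1}{2^{X_{\max}}}, & I_{ij} > 0, \\ 0, & I_{ij} = 0, \end{cases} \tag{6}$$
where $I_{ij}$ is an indicator function. Note that $I_{ij} > 0$ ($I_{ij} = 0$) indicates that user $i$’s preference for item $j$ is known (unknown). $X_{ij}$ denotes the rating given by user $i$ to item $j$, and $X_{\max}$ is the highest rating.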
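The ERR computation described above can be sketched in a few lines. The code assumes the standard cascade form of ERR, in which the user stops at a position with the relevance probability of the item there, provided no earlier item already satisfied the user, and the usual exponential mapping from ratings to relevance probabilities. The function names, the use of 0 or None for an unknown rating, and the optional cutoff `k` are choices of this sketch.

```python
def relevance_probability(rating, max_rating):
    """Map a rating to a probability of relevance; unknown ratings (0 or None) map to 0."""
    if not rating:
        return 0.0
    return (2 ** rating - 1) / (2 ** max_rating)

def expected_reciprocal_rank(ranked_ratings, max_rating=5, k=None):
    """ERR of a list of true ratings given in predicted rank order (best first)."""
    if k is not None:
        ranked_ratings = ranked_ratings[:k]
    err = 0.0
    p_not_stopped = 1.0  # probability that no earlier item satisfied the user
    for position, rating in enumerate(ranked_ratings, start=1):
        r = relevance_probability(rating, max_rating)
        err += p_not_stopped * r / position
        p_not_stopped *= 1.0 - r
    return err

# The same three items, ranked well and ranked badly.
err_good = expected_reciprocal_rank([5, 0, 3])
err_bad = expected_reciprocal_rank([3, 0, 5])
```

Placing the 5-star item first gives a much higher ERR than placing it third, which reflects the strong emphasis of the metric on the top of the list.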
In this paper, we define that denotes the set of all items that user gave explicit feedback, so in the dataset that the users only gave explicit feedback, and the implicit feedback dataset is created by setting all the rating data in explicit feedback dataset as 1. A toy example can be seen in Figure 2. If the dataset contains both explicit feedback data and implicit feedback data, , denoting the set of all items that user only gave implicit feedback. A toy example can be seen in Figure 3. The influence of implicit feedback on the performance of MERR_SVD++ can be found in Section 5.4.2. If we use the traditional SVD++ algorithm to unify explicit and implicit feedback data, the prediction formula of in SVD++ can be defined as
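The construction of the implicit feedback dataset from an explicit-only dataset, as described above, amounts to replacing every observed rating by 1. A minimal sketch follows; the nested-dictionary layout (user to item to rating) and the function name are assumptions made for illustration.

```python
def to_implicit(explicit):
    """Replace every observed rating by 1, keeping only the fact that feedback exists."""
    return {user: {item: 1 for item in items} for user, items in explicit.items()}

explicit_feedback = {
    "user_a": {"item_1": 5, "item_3": 2},
    "user_b": {"item_2": 4},
}
implicit_feedback = to_implicit(explicit_feedback)
```

Unobserved user-item pairs stay absent in both structures, so the implicit matrix remains positive-only.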
So far, through the introduction of SVD++, we can exploit both explicit and implicit feedback simultaneously by optimizing evaluation metric ERR. So we call our model MERR_SVD++.
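Note that the value of the rank $R_{ij}$ depends on the predicted relevance value $f_{ij}$. For example, if the predicted relevance value $f_{ij}$ of item $j$ is the second highest among all the items for user $i$, then we will have $R_{ij} = 2$.
Similar to other ranking measures such as RR, ERR is also nonsmooth with respect to the latent factors of users and items, that is, $U$, $V$, and $Y$. It is thus impossible to optimize ERR directly using conventional optimization techniques. We thus employ smoothing techniques that were also used in CLiMF and xCLiMF to attain a smoothed version of ERR. In particular, we approximate the rank-based terms $1/R_{ij}$ and $I(R_{ik} < R_{ij})$ in (5) by smooth functions with respect to the model parameters $U$, $V$, and $Y$. The approximate formula is as follows:
$$I(R_{ik} < R_{ij}) \approx g(f_{ik} - f_{ij}), \qquad \frac{1}{R_{ij}} \approx g(f_{ij}), \tag{8}$$
where $g(x)$ is a logistic function, that is, $g(x) = 1/(1 + e^{-x})$. Substituting these approximations into (5) gives the smoothed version of ERR:
$$\mathrm{ERR}_i \approx \sum_{j=1}^{n} r_{ij}\, g(f_{ij}) \prod_{k=1}^{n} \bigl( 1 - r_{ik}\, g(f_{ik} - f_{ij}) \bigr). \tag{9}$$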
Given the monotonicity of the logarithm function, the model parameters that maximize (9) are equivalent to the parameters that maximize . Specifically, we have
Based on Jensen’s inequality and the concavity of the logarithm function in a similar manner to [14, 18], we derive the lower bound of as follows:We can neglect the constant in the lower bound and obtain a new objective function:Taking into account all users and using the Frobenius norm of the latent factors for regularization, we obtain the objective function of MERR_SVD++:in which denotes the regularization coefficient and denotes the Frobenius norm of . Note that the lower bound is much less complex than the original objective function in (9), and standard optimization methods, for example, gradient ascend, can be used to learn the optimal model parameters , , and .
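To make the smoothing and the lower bound concrete, the following minimal sketch evaluates both quantities for a single user. It assumes the CLiMF-style smoothing in which the reciprocal rank of item j is replaced by g(f_j) and the indicator that item k is ranked above item j by g(f_k - f_j), where f denotes predicted relevance scores and r relevance probabilities. The function names, the toy values, and the choice of normalizing by the sum of the relevance probabilities (so that the Jensen weights sum to one) are assumptions of this sketch rather than details taken from the paper.

```python
import math

def g(x):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-x))

def smoothed_err(scores, rel):
    """sum_j r_j * g(f_j) * prod_k (1 - r_k * g(f_k - f_j))."""
    total = 0.0
    for f_j, r_j in zip(scores, rel):
        prod = 1.0
        for f_k, r_k in zip(scores, rel):
            prod *= 1.0 - r_k * g(f_k - f_j)
        total += r_j * g(f_j) * prod
    return total

def lower_bound(scores, rel):
    """sum_j r_j * [ln g(f_j) + sum_k ln(1 - r_k * g(f_k - f_j))], without regularization."""
    total = 0.0
    for f_j, r_j in zip(scores, rel):
        if r_j == 0.0:
            continue  # items with zero relevance probability contribute nothing
        inner = math.log(g(f_j))
        for f_k, r_k in zip(scores, rel):
            inner += math.log(1.0 - r_k * g(f_k - f_j))
        total += r_j * inner
    return total

scores = [1.2, -0.3, 0.4]            # predicted relevance of three items
rel = [31 / 32, 0.0, 7 / 32]         # ratings 5, unknown, 3 on a 1-5 scale
weight = sum(rel)
log_of_mean = math.log(smoothed_err(scores, rel) / weight)
mean_of_logs = lower_bound(scores, rel) / weight
```

By the concavity of the logarithm, `log_of_mean` is never smaller than `mean_of_logs`, and the bound involves only sums of logarithms, which is what makes its gradient simple to compute.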
We can now maximize the objective function (13) with respect to the latent factors . Note that represents an approximation of the mean value of ERR across all the users. We can thus remove the constant coefficient , since it has no influence on the optimization of . Since the objective function is smooth we can use gradient ascent for the optimization. The gradients can be derived in a similar manner to xCLiMF , as shown in the following:The learning algorithm for the MERR_SVD++ model is outlined in Algorithm 1.
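A gradient ascent step on an objective of this kind can be sketched as follows. For brevity the sketch optimizes only a single user vector u with scores f_j = V_j · u, dropping the implicit feedback term and keeping the item factors fixed; the objective is the log-based lower bound sum_j r_j [ln g(f_j) + sum_k ln(1 - r_k g(f_k - f_j))] minus an L2 penalty. The gradient is derived directly from this expression and is not copied from the paper; all names, the learning rate, and the toy values are hypothetical.

```python
import math

def g(x):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-x))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def objective(u, V, rel, lam):
    """Regularized lower-bound objective for one user vector u."""
    f = [dot(v, u) for v in V]
    total = 0.0
    for j, r_j in enumerate(rel):
        if r_j == 0.0:
            continue
        inner = math.log(g(f[j]))
        for k, r_k in enumerate(rel):
            inner += math.log(1.0 - r_k * g(f[k] - f[j]))
        total += r_j * inner
    return total - 0.5 * lam * dot(u, u)

def gradient_u(u, V, rel, lam):
    """Analytic gradient of `objective` with respect to u."""
    f = [dot(v, u) for v in V]
    grad = [-lam * x for x in u]
    for j, r_j in enumerate(rel):
        if r_j == 0.0:
            continue
        for a in range(len(u)):
            grad[a] += r_j * g(-f[j]) * V[j][a]  # from ln g(f_j)
        for k, r_k in enumerate(rel):
            s = g(f[k] - f[j])
            coeff = r_j * r_k * s * (1.0 - s) / (1.0 - r_k * s)
            for a in range(len(u)):
                grad[a] += coeff * (V[j][a] - V[k][a])  # from ln(1 - r_k g(f_k - f_j))
    return grad

def ascend(u, V, rel, lam=0.01, lr=0.05, steps=50):
    """Plain gradient ascent with a fixed learning rate."""
    for _ in range(steps):
        grad = gradient_u(u, V, rel, lam)
        u = [x + lr * gx for x, gx in zip(u, grad)]
    return u

V = [[0.3, -0.1], [0.2, 0.4], [-0.5, 0.1]]  # fixed item factors, d = 2
rel = [31 / 32, 0.0, 7 / 32]
u0 = [0.1, 0.1]
u1 = ascend(u0, V, rel)
```

In a full implementation the same kind of update would also be applied to the item factors and to the implicit item factors, with the implicit term included in the user vector.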
The published research papers in [8–11, 13, 18] show that the use of the normal distribution for the initialization of the feature matrices is very effective, so we also use this approach for the initialization of $U$, $V$, and $Y$ in our proposed algorithm.
4.3. Computational Complexity
Here, we first analyze the complexity of the learning process for one iteration. By exploiting the data sparseness in , the computational complexity of the gradient in (14) is . Note that denotes the average number of relevant items across all the users. The computational complexity of the gradient in (16) is also . The computational complexity of the gradient in (15) is . Hence, the complexity of the learning algorithm in one iteration is in the order of . In the case that is a small number, that is, , the complexity is linear to the number of users in the data collection. Note that we have , in which denotes the number of nonzeros in the user-item matrix. The complexity of the learning algorithm is then . Since we usually have , the complexity is even in the case that is large, that is, being linear to the number of nonzeros (i.e., relevant observations in the data). In sum, our analysis shows that MERR_SVD++ is suitable for large scale use cases. Note that we also empirically verify the complexity of the learning algorithm in Section 5.4.3.
5. Experiments
5.1. Datasets
We use two datasets for the experiments. The first is the MovieLens 1 million dataset (ML1m) , which contains ca. 1M ratings (1–5 scale) from ca. 6K users and 3.7K movies. The sparseness of the ML1m dataset is 95.53%. The second dataset is the Netflix dataset [2, 16, 17], which contains 100,000,000 ratings (1–5 scale) from 480,189 users on 17,770 movies. Due to the huge size of the Netflix data, we extract a subset of 10,000 users and 10,000 movies, in which each user has rated more than 100 different movies.
5.2. Evaluation Metrics
As has been argued in prior work, NDCG is a very suitable evaluation metric for personalized ranking algorithms that combine explicit and implicit feedback. Our proposed MERR_SVD++ algorithm exploits both explicit and implicit feedback simultaneously and optimizes the well-known personalized ranking evaluation metric Expected Reciprocal Rank (ERR). We therefore use NDCG and ERR as the evaluation metrics for the predictive performance of the models in this paper.
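NDCG is one of the most widely used measures for ranking problems. To define $\mathrm{NDCG}_i$ for a user $i$, one first needs to define $\mathrm{DCG}_i$:
$$\mathrm{DCG}_i = \sum_{k=1}^{K} \frac{2^{\mathrm{pref}(k)} - 1}{\log_2(k + 1)},$$
where $\mathrm{pref}(k)$ is a binary indicator returning 1 if the $k$th item is preferred and 0 otherwise. $\mathrm{DCG}_i$ is then normalized by the ideal ranked list into the interval $[0, 1]$:
$$\mathrm{NDCG}_i = \frac{\mathrm{DCG}_i}{\mathrm{IDCG}_i},$$
where $\mathrm{IDCG}_i$ denotes the DCG obtained when the ranked list is sorted exactly according to the user’s tastes: positive items are placed at the head of the ranked list. The NDCG of all the users is the mean score of each user.
ERR is a generalized version of Reciprocal Rank (RR) designed to be used with multiple relevance level data (e.g., ratings). It has similar properties to RR in that it strongly emphasizes the relevance of results returned at the top of the list. Using the definition of ERR given in Section 4.1, we can define ERR for a ranked list of $n$ items of user $i$ as follows:
$$\mathrm{ERR}_i = \sum_{j=1}^{n} \frac{r_{ij}}{R_{ij}} \prod_{k=1}^{n} \bigl( 1 - r_{ik}\, I(R_{ik} < R_{ij}) \bigr),$$
where $R_{ij}$ is the rank position of item $j$ for user $i$, $r_{ij}$ is the probability that user $i$ finds item $j$ relevant, and $I(\cdot)$ is the indicator function. Similar to NDCG, the ERR of all the users is the mean score of each user.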
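A minimal sketch of NDCG with binary preference indicators follows. It assumes the common form of DCG with gain 2^pref - 1 and a base-2 logarithmic discount, and it builds the ideal list by moving the preferred items of the same list to the top; the function names and the cutoff parameter `k` are choices of this sketch.

```python
import math

def dcg_at_k(preferred_flags, k):
    """DCG over the first k positions; flags are 1 (preferred) or 0 (not preferred)."""
    return sum((2 ** flag - 1) / math.log2(pos + 1)
               for pos, flag in enumerate(preferred_flags[:k], start=1))

def ndcg_at_k(preferred_flags, k):
    """DCG normalized by the DCG of the ideally sorted list, so the result lies in [0, 1]."""
    ideal = sorted(preferred_flags, reverse=True)
    idcg = dcg_at_k(ideal, k)
    return dcg_at_k(preferred_flags, k) / idcg if idcg > 0 else 0.0

perfect = ndcg_at_k([1, 1, 1, 0, 0], 5)
mixed = ndcg_at_k([0, 1, 1, 0, 1], 5)
```

A list with all preferred items at the head scores 1, and any other ordering of the same items scores less.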
Since in recommender systems the user’s satisfaction is dominated by only a few items on the top of the recommendation list, our evaluation in the following experiments focuses on the performance of top-5 recommended items, that is, NDCG@5 and ERR@5.
5.3. Experiment Setup
For each dataset, we randomly selected 5 rated items (movies) and 1,000 unrated items (movies) for each user to form a test set. We then randomly selected a varying number of rated items from the rest to form a training set. For example, just as in [14, 18], under the condition of “Given 5,” we randomly selected 5 rated items (disjoint to the items in the test set) for each user in order to generate a training set. We investigated a variety of “Given” conditions for the training sets, that is, 5, 10, and 15 for the ML1m dataset and 10, 20, and 30 for the extracted Netflix dataset. Generated recommendation lists for each user are compared to the ground truth in the test set in order to measure the performance.
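The split protocol described above can be sketched for a single user as follows. The function name, its parameters, the fixed random seed, and the dictionary representation of a user's ratings are assumptions of this sketch; only the counts (5 rated test items, up to 1,000 unrated test items, and a "Given" number of training items disjoint from the test items) follow the text.

```python
import random

def given_n_split(user_ratings, all_items, n_given, n_test=5, n_unrated=1000, seed=0):
    """Split one user's data into training items, rated test items, and unrated test items."""
    rng = random.Random(seed)
    rated = sorted(user_ratings)
    rng.shuffle(rated)
    test_rated = rated[:n_test]
    train = rated[n_test:n_test + n_given]  # disjoint from the test items by construction
    unrated_pool = [item for item in all_items if item not in user_ratings]
    test_unrated = rng.sample(unrated_pool, min(n_unrated, len(unrated_pool)))
    return train, test_rated, test_unrated

# Toy user who rated 20 of 100 items, evaluated under a "Given 5" condition.
ratings = {item: (item % 5) + 1 for item in range(20)}
train, test_rated, test_unrated = given_n_split(ratings, range(100), n_given=5, n_unrated=30)
```

Repeating the split with different seeds gives the independent runs over which results can be averaged.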
All the models were implemented in MATLAB R2009a. For MERR_SVD++, the value of the regularization parameter was selected from range and optimal parameter value was used. And the learning rate was selected from set , , and the optimal parameter value was also used. In order to compare their performances fairly, for all matrix factorization models we set the number of features to be 10. The optimal values of all parameters for all the baseline models used are determined individually. More detailed setting methods of the parameters for all the baselines can be found in the corresponding references. For all the algorithms used in our experiments, we repeated the experiment 5 times for each of the different conditions of each dataset, and the performances reported were averaged across 5 runs.
5.4. Experiment Results
In this section we present a series of experiments to evaluate MERR_SVD++. We designed the experiments in order to address the following research questions:
(1) Does the proposed MERR_SVD++ outperform state-of-the-art personalized ranking approaches for top-N recommendation?
(2) Does the performance of MERR_SVD++ improve when we only increase the amount of implicit feedback data for each user?
(3) Is MERR_SVD++ scalable for large scale use cases?
5.4.1. Performance Comparison
We compare the performance of MERR_SVD++ with that of five baseline algorithms. The approaches we compare with are listed below:
(i) Co-Rating: a state-of-the-art CF model that can be trained from explicit and implicit feedback simultaneously.
(ii) SVD++: the first proposed CF model that combines explicit and implicit feedback.
(iii) xCLiMF: a state-of-the-art PR approach which aims at directly optimizing ERR for top-N recommendation in domains with explicit feedback data (e.g., ratings).
(iv) CofiRank: a PR approach that optimizes the NDCG measure for domains with explicit feedback data (e.g., ratings). The implementation is based on the publicly available software package from the authors.
(v) CLiMF: a state-of-the-art PR approach that optimizes the Mean Reciprocal Rank (MRR) measure for domains with implicit feedback data (e.g., click, follow). We use the explicit feedback datasets by binarizing the rating values with a threshold. On the ML1m dataset and the extracted Netflix dataset, we take ratings 4 and 5 (the highest two relevance levels), respectively, as the relevance threshold for top-N recommendation.
The results of the experiments on the ML1m and the extracted Netflix datasets are shown in Figure 4. Rows denote the varieties of “Given” conditions for the training sets based on the ML1m and the extracted Netflix datasets and columns denote the quality of NDCG and ERR. Figure 4 shows that MERR_SVD++ outperforms the baseline approaches in terms of both ERR and NDCG in all of the cases. The results show that the improvement of ERR aligns consistently with the improvement of NDCG, indicating that optimizing ERR would not degrade the utility of recommendations that are captured by the NDCG measure. It can be seen that the relative performance of MERR_SVD++ improves as the number of observed ratings from the users increases. This result indicates that MERR_SVD++ can learn better top-N recommendation models if more observations of the graded relevance data from users can be used. The results also reveal that it is difficult to model user preferences encoded in multiple levels of relevance with limited observations, in particular, when the number of observations is lower than the number of relevance levels.
Compared to Co-Rating, which is based on rating prediction, MERR_SVD++ is based on ranking prediction and succeeds in enhancing the top-ranked performance by optimizing ERR. As reported in , the performance of SVD++ is slightly weaker than that of Co-Rating, which is because SVD++ model only attempts to approximate the observed ratings and does not model preferences expressed in implicit feedback. The results also show that Co-Rating and SVD++ significantly outperform xCLiMF and CofiRank, which confirms our belief that implicit feedback could indeed complement explicit feedback. It can be seen in Figure 4 that xCLiMF significantly outperforms CLiMF in terms of both ERR and NDCG, across all the settings of relevance thresholds and datasets. The results indicate that the information loss from binarizing multilevel relevance data would inevitably make recommendation models based on binary relevance data, such as CLiMF, suboptimal for the use cases with explicit feedback data.
Hence, we give a positive answer to our first research question.
5.4.2. The Influence of Implicit Feedback on the Performance of MERR_SVD++
The influence of implicit feedback on the performance of MERR_SVD++ can be found in Figure 5. Here, . denotes the set of all items that user only gave implicit feedback. Rows denote the increased numbers of implicit feedback for each user and columns denote the quality of ERR and NDCG. In our experiment, the increased numbers of implicit feedback for each user are the same. We use the extracted Netflix dataset under the condition of “Given 10.” Figure 5 shows that the quality of ERR and NDCG of MERR_SVD++ synchronously and linearly improves with the increase of implicit feedback for each user, which confirms our belief that implicit feedback could indeed complement explicit feedback.
With this experimental result, we give a positive answer to our second research question.
5.4.3. Scalability
The last experiment investigated the scalability of MERR_SVD++, by measuring the training time that was required for the training set at different scales. Firstly, as analyzed in Section 4.3, the computational complexity of MERR_SVD++ is linear in the number of users in the training set when the average number of items rated per user is fixed. To demonstrate the scalability, we used different numbers of users in the training set under each condition: we randomly selected from 10% to 100% users in the training set and their rated items as the training data for learning the latent factors. The results on the ML1m dataset are shown in Figure 6. We can observe that the computational time under each condition increases almost linearly to the increase of the number of users. Secondly, as also discussed in Section 4.3, the computational complexity of MERR_SVD++ could be further approximated to be linear to the amount of known data (i.e., nonzero entries in the training user-item matrix). To demonstrate this, we examined the runtime of the learning algorithm against different scales of the training sets under different “Given” conditions. The result is shown in Figure 6, from which we can observe that the average runtime of the learning algorithm per iteration increases almost linearly as the number of nonzeros in the training set increases.
The observations from this experiment allow us to answer our last research question positively.
6. Conclusion and Future Work
Previous research on personalized ranking has focused on either explicit feedback data or implicit feedback data rather than making full use of the information in the dataset. Until now, no personalized ranking algorithm has exploited both explicit and implicit feedback. To overcome this shortcoming, in this paper we have presented a new personalized ranking algorithm (MERR_SVD++) that exploits both explicit and implicit feedback simultaneously. MERR_SVD++ optimizes the well-known evaluation metric Expected Reciprocal Rank (ERR) and is based on the recent xCLiMF model and the SVD++ algorithm. Experimental results on practical datasets showed that the proposed algorithm outperformed existing personalized ranking algorithms on different evaluation metrics and that the running time of MERR_SVD++ grew linearly with the number of ratings. Because of its high precision and good scalability, MERR_SVD++ is suitable for processing big data, can improve the speed and validity of recommendation by reducing the latency of personalized recommendation, and has wide application prospects in the field of internet information recommendation. Because MERR_SVD++ exploits both explicit and implicit feedback simultaneously, it can also alleviate the data sparsity and imbalance problems of personalized ranking algorithms to a certain extent.
For future work, we plan to extend our algorithm so that it can address the grey sheep and cold start problems of personalized recommendation. We would also like to exploit more of the useful information contained jointly in explicit and implicit feedback.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgments
This work is sponsored in part by the National Natural Science Foundation of China (nos. 61370186, 61403264, 61402122, 61003140, 61033010, and 61272414), Science and Technology Planning Project of Guangdong Province (nos. 2014A010103040 and 2014B010116001), Science and Technology Planning Project of Guangzhou (nos. 2014J4100032 and 201510010203), the Ministry of Education and China Mobile Research Fund (no. MCM20121051), the second batch open subject of mechanical and electrical professional group engineering technology development center in Foshan city (no. 2015-KJZX139), and the 2015 Research Backbone Teachers Training Program of Shunde Polytechnic (no. 2015-KJZX014).
S. Ahn, A. Korattikara, and N. Liu, “Large-scale distributed Bayesian matrix factorization using stochastic gradient MCMC,” in Proceedings of the 21th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 401–410, ACM Press, Sydney, Australia, August 2015.View at: Google Scholar
Q. Yao and J. Kwok, “Accelerated inexact soft-impute for fast large-scale matrix completion,” in Proceedings of the 24rd International Joint Conference on Artificial Intelligence, pp. 4002–4008, ACM Press, Buenos Aires, Argentina, July 2015.View at: Google Scholar
Q. Diao, M. Qiu, C.-Y. Wu, A. J. Smola, J. Jiang, and C. Wang, “Jointly modeling aspects, ratings and sentiments for movie recommendation (JMARS),” in Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '14), pp. 193–202, ACM, New York, NY, USA, August 2014.View at: Publisher Site | Google Scholar
M. Jahrer and A. Toscher, “Collaborative filtering ensemble for ranking,” in Proceedings of the 17th International Conference on Knowledge Discovery and Data Mining, pp. 153–167, ACM, San Diego, Calif, USA, August 2011.
G. Takács and D. Tikk, “Alternating least squares for personalized ranking,” in Proceedings of the 6th ACM Conference on Recommender Systems, pp. 83–90, ACM, Dublin, Ireland, 2012.
H. Wang, X. Shi, and D. Yeung, “Relational stacked denoising autoencoder for tag recommendation,” in Proceedings of the 29th AAAI Conference on Artificial Intelligence, pp. 3052–3058, ACM Press, Austin, Tex, USA, January 2015.
S. Rendle, C. Freudenthaler, Z. Gantner, and L. Schmidt-Thieme, “BPR: Bayesian personalized ranking from implicit feedback,” in Proceedings of the 25th Conference on Uncertainty in Artificial Intelligence (UAI '09), pp. 452–461, Morgan Kaufmann, Montreal, Canada, 2009.
Y. Shi, A. Karatzoglou, L. Baltrunas, M. Larson, N. Oliver, and A. Hanjalic, “CLiMF: collaborative less-is-more filtering,” in Proceedings of the 23rd International Joint Conference on Artificial Intelligence (IJCAI '13), pp. 3077–3081, ACM, Beijing, China, August 2013.
W. K. Pan and L. Chen, “GBPR: group preference based Bayesian personalized ranking for one-class collaborative filtering,” in Proceedings of the 23rd International Joint Conference on Artificial Intelligence (IJCAI '13), pp. 2691–2697, Beijing, China, 2013.
N. N. Liu, E. W. Xiang, M. Zhao, and Q. Yang, “Unifying explicit and implicit feedback for collaborative filtering,” in Proceedings of the 19th International Conference on Information and Knowledge Management (CIKM '10), pp. 1445–1448, ACM, Toronto, Canada, October 2010.
Y. Shi, A. Karatzoglou, L. Baltrunas, M. Larson, and A. Hanjalic, “xCLiMF: optimizing expected reciprocal rank for data with multiple levels of relevance,” in Proceedings of the 7th ACM Conference on Recommender Systems (RecSys '13), pp. 431–434, ACM, Hong Kong, October 2013.
R. Salakhutdinov, A. Mnih, and G. Hinton, “Restricted Boltzmann machines for collaborative filtering,” in Proceedings of the 24th International Conference on Machine Learning (ICML '07), pp. 791–798, ACM Press, Corvallis, Ore, USA, June 2007.
N. Srebro, J. Rennie, and T. Jaakkola, “Maximum-margin matrix factorization,” in Proceedings of the 18th Annual Conference on Neural Information Processing Systems, pp. 11–217, British Columbia, Canada, 2004.
D. Feldman and T. Tassa, “More constraints, smaller coresets: constrained matrix approximation of sparse big data,” in Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '15), pp. 249–258, ACM Press, Sydney, Australia, August 2015.
S. Mirisaee, E. Gaussier, and A. Termier, “Improved local search for binary matrix factorization,” in Proceedings of the 29th AAAI Conference on Artificial Intelligence, pp. 1198–1204, ACM Press, Austin, Tex, USA, January 2015.
T. Y. Liu, Learning to Rank for Information Retrieval, Springer, New York, NY, USA, 2011.
N. N. Liu and Q. Yang, “EigenRank: a ranking-oriented approach to collaborative filtering,” in Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (ACM SIGIR '08), pp. 83–90, ACM, Singapore, July 2008.
M. Weimer, A. Karatzoglou, Q. V. Le, and A. J. Smola, “CofiRank–maximum margin matrix factorization for collaborative ranking,” in Proceedings of the 21st Conference on Advances in Neural Information Processing Systems, pp. 79–86, Curran Associates, Vancouver, Canada, 2007.
O. Chapelle, D. Metzler, Y. Zhang, and P. Grinspan, “Expected reciprocal rank for graded relevance,” in Proceedings of the 18th ACM International Conference on Information and Knowledge Management (CIKM '09), pp. 621–630, ACM, New York, NY, USA, November 2009.
E. M. Voorhees, “The TREC-8 question answering track report,” in Proceedings of the 8th Text Retrieval Conference (TREC-8 '99), Gaithersburg, Md, USA, November 1999.