Abstract

Recommender systems are used to make recommendations about products, information, or services for users. Most existing recommender systems implicitly assume one particular type of user behavior. However, they seldom consider user-recommender interactive scenarios in real-world environments. In this paper, we propose a hybrid recommender system based on user-recommender interaction and evaluate its performance with recall and diversity metrics. First, we define the user-recommender interaction. The recommender system accepts a user request, recommends N items to the user, and records the user’s choice. If some of these items interest the user, she selects one to browse and continues to use the recommender system until none of the recommended items interests her. Second, we propose a hybrid recommender system combining random and k-nearest neighbor algorithms. Third, we redefine the recall and diversity metrics based on the new scenario to evaluate the recommender system. Experimental results on the well-known MovieLens dataset show that the hybrid algorithm is more effective than nonhybrid ones.

1. Introduction

Recommender systems (RSs) have been extensively studied to present items, such as movies, music, and books [1–3]. There are three main types of recommendation methods: memory-based, model-based, and hybrid [4]. Memory-based methods [4, 5] usually use similarity metrics to obtain the distance between two users or two items. Model-based methods [4, 6, 7] use demographic, content, or aggregated information to create a model that generates recommendations. Hybrid methods [8–10] combine different types of recommenders to gain better performance [11].

Most existing RSs implicitly assume one particular type of user behavior: browsing, rating, or a sequence-independent manner. With browsing behavior [12, 13], the user only specifies which items are browsed. With rating behavior [14, 15], the user assigns scores to items. The sequence-independent manner [16] does not depend on the sequence of recommended items. However, these RSs seldom consider user-recommender interactive scenarios in real-world environments.

In this paper, we propose a hybrid recommender system based on user-recommender interaction. The interaction provides a framework for user-recommender behavior and recommender algorithms. We employ the random algorithm to deal with the cold-start problem in the face of data sparsity. The hybrid algorithm is applied to incremental recommendation once the data has reached a certain scale. The sequence of the user’s choices affects subsequent recommendations in the process of user-recommender interaction. Consequently, the interaction forms an active learning scenario.

This work has four aspects. First, we define the user-recommender interaction. The recommender behavior has three steps: (1) accept the user request, (2) recommend items to the user with a certain algorithm, and (3) record the user’s choice for further recommendation. The user behavior has three steps: (1) get recommendations of items, (2) compare them with her interest, and (3) select one to browse. Second, we build an interactive hybrid RS based on active and incremental learning. In the initialization stage, no record of recommendation exists. The items are recommended through a random algorithm. In the transition stage, the data is very sparse. The items are recommended based on a hybrid algorithm of both random and k-nearest neighbors (kNN). In the stable stage, the scale of data reaches a certain degree. The items are recommended mainly through the kNN algorithm. Third, we redefine the recall and diversity metrics [17] under the new scenario to evaluate the quality of the RS. The number of recommendations in each turn essentially serves as a constraint. Fourth, we test the random, kNN, and our hybrid algorithm with the new metrics.

The contribution of the paper is threefold. (1) We define the interactive behavior between user and recommender. The new scenario has practical significance because it is closer to the real world. (2) Common accuracy measures based on error metrics (such as RMSE and MAE [18]) are not a natural fit for evaluating the interactive RS. We redefine the recall and diversity metrics [19] based on the new scenario to evaluate our RS. (3) We test the hybrid algorithm in the process of interactive recommendation.

Experiments on the well-known MovieLens dataset (http://www.movielens.org/) show that (1) when the ratio of random recommendations is not too big, it has no great influence on the performance; (2) the RS should employ the kNN algorithm for recommendation as early as possible; (3) the hybrid algorithm performs better than the nonhybrid one.

The rest of the paper is organized as follows. Section 2 states the assumptions for the interactive recommendation scenario and defines the user and recommender behaviors. Section 3 builds a new hybrid recommender system based on the user-recommender interaction. Section 4 presents experimental results on the MovieLens dataset with the recall and diversity metrics and discusses appropriate parameter settings of our hybrid algorithm in detail. Section 5 briefly presents some of the research literature related to user-recommender interaction, active learning, kNN, hybrid filtering, and granular computing. Finally, concluding remarks are made in Section 6.

2. The Interactive Scenario

In this section, we state the assumptions for the interactive recommendation scenario and define the behaviors of the browse user and the recommender.

2.1. Binary Relations for Recommender Systems

In this subsection, we revisit the definition of binary relations in information systems [18].

Definition 1. Let $U$ and $V$ be two sets of objects. Any $R \subseteq U \times V$ is a binary relation from $U$ to $V$.

An example of a binary relation is given in Table 1, where $U$ is the set of users and $V$ is the set of movies. A binary relation can be viewed as an information system. However, in order to save space, it is more often stored in the database as a table with two foreign keys.
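The storage remark above can be illustrated with a minimal sketch: the relation is held as a set of (user, movie) pairs, mirroring a two-foreign-key table, and expanded into a 0/1 matrix on demand. The identifiers `users`, `movies`, `watched`, and `to_matrix` are hypothetical and the data is made up; none of it is taken from Table 1.

```python
# Hypothetical data; these names and pairs are illustrative only.
users = ["u1", "u2", "u3"]
movies = ["m1", "m2", "m3", "m4"]

# The binary relation as a set of pairs (one row per pair in a two-key table).
watched = {("u1", "m1"), ("u1", "m3"), ("u2", "m2"), ("u3", "m1"), ("u3", "m4")}


def to_matrix(relation, rows, cols):
    """Expand a set of (row, col) pairs into a 0/1 matrix (list of lists)."""
    return [[1 if (r, c) in relation else 0 for c in cols] for r in rows]


matrix = to_matrix(watched, users, movies)
```

The pair representation grows with the number of watched movies only, whereas the matrix always needs one cell per user-movie combination.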

2.2. Assumptions

We make the following assumptions to simplify the scenario.
(1) The user interest is deterministic. Given a set of items, the user is interested in a fixed subset. This may be different from the reality where the user interest changes over time or is influenced by the recommendation sequence.
(2) The items set does not change. In slowly evolving applications, this might be true. In rapidly changing applications such as e-commerce, this is not the case since items are added and removed frequently.
(3) The users set does not change. This is similar to the case of the items set.
(4) The RS starts from the initial state where there is no browse history. In other words, all values of the browse matrix are 0 at the beginning.
(5) A fixed number of items are recommended to the user in each round. Let $N$ be the number of items recommended to a user each time. For example, a fixed number of website links are shown in one web-search page.

For the first assumption, we can define a user interest matrix as follows.

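Definition 2. Let $U = \{u_1, u_2, \ldots, u_n\}$ be the set of all users and $V = \{v_1, v_2, \ldots, v_m\}$ the set of all items. The user interest matrix is $I = (I_{i,j})_{n \times m}$, where
$$I_{i,j} = \begin{cases} 1, & \text{if } u_i \text{ is interested in } v_j; \\ 0, & \text{otherwise.} \end{cases}$$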

2.3. User and Recommender Behaviors

The user-recommender interaction is a mutual action between the login user and the recommender system. The user first logs in to the system. The system then returns one or multiple recommended items. The user selects an item as her choice or terminates the interaction.

Users of an RS can be divided into two types according to their feedback. If the user only specifies which items are browsed, she will be called a browse user. If the user specifies ratings for items, she will be called a rate user. We only consider browse users throughout the paper.

The user behavior is illustrated in Figure 1. A browse user logs in to the system and gets recommendations. The user can obtain a candidate browse set according to her interest and the recommendations. We define the candidate browse set as follows.

Definition 3. Let $I_u$ be the interest set of user $u$ and $T$ the recommended set. $C = I_u \cap T$ is the candidate browse set.

If $C \neq \emptyset$, $u$ is interested in some of these items. Let $L = (l_1, l_2, \ldots, l_N)$ be the recommended list. The user will browse the first match of $L$ and continue to use the system. We define the first match as follows.

Definition 4. If $l_i \in C$ and $l_j \notin C$ for all $j < i$, then $l_i$ is the first match.

If $C = \emptyset$, $u$ will quit the system since the recommended items do not match her interest. The user has no patience to continue to use the RS. Since the number of items is limited, the browse user will eventually quit.
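The candidate browse set and the first match described above can be sketched in a few lines. This is only an illustration; the function names `candidate_browse_set` and `first_match` and the movie identifiers are hypothetical, and returning `None` is our way of modelling the case where the user quits.

```python
def candidate_browse_set(interests, recommended):
    """Items that are both recommended and interesting to the user."""
    return set(interests) & set(recommended)


def first_match(interests, recommended_list):
    """Return the first recommended item the user is interested in.

    Returns None when the candidate browse set is empty, which models
    the user quitting the system.
    """
    for item in recommended_list:
        if item in interests:
            return item
    return None


# Hypothetical example: the user likes m2 and m3; m1 is skipped.
choice = first_match({"m2", "m3"}, ["m1", "m3", "m2"])
```

The order of the recommended list matters here: with the same three items in a different order, the user could browse `m2` instead.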

We define a grey list for improving the effectiveness of recommendations. Based on Definition 4, we consider the recommended items that precede the first match to be not interesting for the user. All of these items are added into the user’s grey list and will not be recommended again.

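After all users quit the system, we obtain a matrix recording which items have been browsed by the users. This is called the browse matrix and is given by $B = (B_{i,j})_{n \times m}$, where $n$ is the number of users, $m$ is the number of items, and
$$B_{i,j} = \begin{cases} 1, & \text{if user } u_i \text{ has browsed item } v_j; \\ 0, & \text{otherwise.} \end{cases} \tag{4}$$
Note that the browse matrix is influenced by not only the recommender system but also the user behavior such as the period between two browse actions.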

In most cases, we assume that the user cannot visit the RS again after she quits the system. However, some users may be very patient and visit the RS repeatedly. We call this kind of user behavior revisit.

The recommender behavior is illustrated in Figure 2. The RS accepts user login and request and recommends items to the browse user. If the user has browsed one item, the RS records her choice to the browse matrix. The number of browsed items will increase due to more interactions. This is also related to incremental learning. If the user cannot discover her interests from the recommendations, she will quit the system. With all users quitting, the system will halt.

3. Hybrid Recommendation for the Interactive Scenario

In this section, we build a new hybrid RS based on the user-recommender interaction. There are three aspects: (1) a new hybrid approach is designed to achieve a tradeoff between the interactive recall and diversity; (2) the model of user-recommender interaction is built based on the new hybrid algorithm; (3) a number of parameters are proposed to adjust our algorithm.

3.1. The Hybrid Recommendation Algorithm

Our hybrid algorithm consists of random and kNN ones. The random algorithm is used to solve the cold-start problem and keep the diversity of the RS. The kNN algorithm finds similar users by computing the cosine value. Each neighbor recommends a certain number of items. A threshold of kNN recommendations determines whether or not to use the kNN algorithm. Based on (4), the number of items browsed by each user is obtained from the browse matrix. Comparing this number with the threshold gives the kNN switch.

The hybrid algorithm is the core of the RS. It has three stages based on the data characteristics of the browse matrix. The initialization stage has three characteristics: (1) the browse matrix is empty; (2) the login user has browsed no items; (3) no user has browsed more items than the kNN threshold.

There are two characteristics of browsed matrix in the transition stage: (1) browsed matrix is not null and (2) the number of browsed items over is greater than zero, but the number of recommended items is not enough to trigger the NN algorithm. The NN algorithm is used to recommend a few items, and the other ones are recommended based on random algorithm.

In the stable stage, the number of browsed items over is enough to meet the NN algorithm to recommend sufficient items. The items are recommended mainly based on NN algorithm. For the diversity of recommendations, the random algorithm is used to recommend a certain proportion of items.

An incremental hybrid algorithm is given by Algorithm 1. The input includes the user id and three user-specified parameters: the total number of recommendations, the ratio of random recommendations, and the kNN threshold. It essentially has four steps.

Input: , , ,
Output: recommended  items()
Method:
(1) = the  browsed  items  by   ;
(2) bm = the  browse  matrix  of  all  users;
(3)if  (  or  )  then
(4)  = recommended     items  by  random  algorithm;
(5)return  ;
(6)end if
(7);
(8) = ;
(9)for  each   do
(10)   = the  number  of  browsed  items  by   ;
(11)  if     then
(12)    = ;
(13)  end if
(14) end for
(15) if    then
(16)   = recommended     items  by  random  algorithm;
(17)  return  ;
(18) end if
(19) = ;
(20) = recommended     items  by  kNN  algorithm;
(21) = ;
(22) = recommended     items  by  random  algorithm;
(23) = ;
(24) return  ;

Step 1. In the initialization stage, there is no history data in the browse matrix. We adopt the random algorithm to deal with the cold-start problem. This step corresponds to Lines 3–6 of the algorithm.

Step 2. The kNN threshold is used to determine whether or not to use the kNN algorithm. We first count the number of browsed items. If the threshold condition is met, the kNN algorithm is employed. This step corresponds to Lines 8–14 of the algorithm.

Step 3. When the recommender is running smoothly, the random algorithm is used to recommend some new items for exploring new interests. This step corresponds to Lines 15–18 of the algorithm.

Step 4. The kNN algorithm is used to recommend some items. The other items are recommended based on the random algorithm in the transition and stable stages. This step corresponds to Lines 19–24 of the algorithm.

The output of the algorithm is stored in the memory for Algorithm 2.
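As a rough illustration of the four steps, the following sketch mixes cosine-based kNN picks with random picks. It is not the authors’ implementation. Several details are assumptions made only for this example: the kNN switch turns on once some user has browsed at least `threshold` items, the number of random slots is the ceiling of `n * ratio`, neighbors recommend items in sorted order, and all names (`hybrid_recommend`, `cosine`, `browsed`, `grey`) are hypothetical.

```python
import math
import random


def cosine(a, b):
    """Cosine similarity between two item sets viewed as 0/1 vectors."""
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))


def hybrid_recommend(user, browsed, items, n, ratio, threshold, k,
                     grey=None, rng=random):
    """Return up to n items for `user`, mixing kNN and random picks.

    browsed: dict mapping each user to the set of items she has browsed.
    grey: optional dict mapping each user to items never to recommend again.
    """
    mine = browsed.get(user, set())
    seen = mine | (grey or {}).get(user, set())
    pool = [i for i in items if i not in seen]

    def random_picks(count, exclude=()):
        choices = [i for i in pool if i not in exclude]
        return rng.sample(choices, min(count, len(choices)))

    # Step 1 (cold start): no history for this user, so recommend randomly.
    if not mine:
        return random_picks(n)

    # Steps 2-3 (assumed switch rule): kNN is used only once some user
    # has browsed at least `threshold` items; otherwise stay random.
    if not any(len(b) >= threshold for b in browsed.values()):
        return random_picks(n)

    # Step 4: reserve some slots for random picks (assumed rounding).
    n_knn = n - math.ceil(n * ratio)

    # The k most similar other users by cosine similarity of browse sets.
    scored = [(cosine(mine, b), v) for v, b in browsed.items() if v != user]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    neighbors = [v for score, v in scored[:k] if score > 0]

    knn_items = []
    for v in neighbors:
        for item in sorted(browsed[v]):
            if len(knn_items) < n_knn and item in pool and item not in knn_items:
                knn_items.append(item)

    # Fill whatever is left with random items.
    return knn_items + random_picks(n - len(knn_items), exclude=knn_items)
```

With `n=3` and `ratio=0.25`, this sketch reserves one slot for a random item and two for kNN items. When the neighbors cannot supply enough new items, the remaining slots fall back to random picks.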

Input: , ,
Output: Recall and diversity
Method:
(1) =   is  the  flag  of  successfully  recommender;
(2) = ;
(3) = an  randomized  array  of   ;
(4) = ;
(5)while    do
(6);
(7)for  each   do
(8)   = hybridAlgorithm();
(9)   = the  interestarray  of   ;
(10)   if    then
(11)    the  user  quits  to  the  RS;
(12)   else
(13)    the  browsed  item  is  recorded  into  bm();
(14)    ;
(15)   end if
(16)  end for
(17) end while
(18) ;
(19) ;
(20) ;
(21) return  recall  and  diversity;

3.2. The User-Recommender Interaction

The model of user-recommender interaction is built in our RS. The RS is divided into the user and recommender parts. The user logs in and browses items. The main work is fulfilled by the recommender part.

For each user, the user-recommender interaction is as follows: (1) the user gets recommendations from the RS; (2) whether or not the user browses an item is based on her own interest list. If the user browses an item, the process continues. Otherwise, she is not interested in these items recommended by RS; she will quit immediately.

For our hybrid RS, the user-recommender interaction is as follows: (1) the RS accepts user request and recommends items to the user based on hybrid algorithm; (2) if the user has browsed one of the recommendations, the RS records the choice to the browsed matrix; otherwise, the interaction between this user and the RS terminates; (3) all users visit the RS in a random sequence.

This process of the interactive recommender is described in Algorithm 2. The input includes the three user-specified parameters of Algorithm 1. It essentially has three steps for each user.

Step 1. Obtain the recommended items by calling Algorithm 1. This step corresponds to Line 8 of the algorithm.

Step 2. If the user is not interested in any of the recommended items, the RS stops recommending to her. This step corresponds to Line 11 of the algorithm.

Step 3. If the user is interested in the recommended items, the RS records the user’s choice and updates the browse matrix. This step corresponds to Line 13 of the algorithm.

Since the number of users is limited, the user-recommender interaction will eventually terminate. The interactive recall and diversity are then computed. This step corresponds to Lines 18-19 of the algorithm.
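The interaction loop can be sketched as follows. This is one possible reading, not a transcription of Algorithm 2. We assume that each user completes her session before the next one logs in (as in the running example), that there are no revisits, and that the grey list is omitted. The names `simulate` and `random_recommender` are hypothetical, and the simple random recommender merely stands in for Algorithm 1.

```python
import random


def simulate(interest, recommend, rng=random):
    """Run the interaction until every user has quit; return the browse sets.

    interest: dict mapping each user to the set of items she likes.
    recommend: callable (user, browsed) -> list of items; it must not
        return items the user has already browsed.
    """
    browsed = {u: set() for u in interest}
    order = list(interest)
    rng.shuffle(order)                     # users visit in a random sequence
    for user in order:
        while True:
            rec_list = recommend(user, browsed)
            match = next((i for i in rec_list if i in interest[user]), None)
            if match is None:              # no match: the user quits
                break
            browsed[user].add(match)       # record the user's choice
    return browsed


def random_recommender(items, n, rng):
    """Stand-in recommender: n random items the user has not browsed yet."""
    def recommend(user, browsed):
        pool = [i for i in items if i not in browsed[user]]
        return rng.sample(pool, min(n, len(pool)))
    return recommend
```

Each successful round removes one item from the user’s pool, so every session ends after finitely many rounds.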

3.3. A Running Example

An example of the interest matrix is given in Table 2. There are four users and eight movies. The browse matrix is empty in the initial stage. The number of recommendations per round, the ratio of random recommendations, the kNN threshold, and $k$ are assigned as 3, 0.25, 1, and 1, respectively, in the running example.

A running example of hybrid recommendation for the interactive scenario is illustrated in Figure 3. The users login on the RS in random order. We assume that a login sequence is .

Figure 3(a) depicts the interaction between and the recommender. Because she is the first login user with no neighbor, the movies are recommended to her based on the random algorithm. The interactive process consists of three rounds. We assume that the recommended list of the first round is . The are in her interests set, while the is not in. The is browsed by her because it is the first match. The corresponding element of the browse matrix is set from 0 to 1. The is added into her grey list. The recommended list of the second round is the . The is browsed by her and the are added into her grey list. The recommended list of the third round is the . The is browsed by her and the are added into her grey list. She will quit the RS because no movie can be recommended to her. Finally, she browsed the and her grey list is .

Figure 3(b) depicts the interaction between and the recommender. The interactive process consists of three rounds. The first round of recommendation is based on random algorithm. The recommended list of the first round is . The is browsed by her and is added into her grey list. Because and , the becomes her neighbor. The are recommended based on NN algorithm and the is recommended based on random one in the second round of recommendation. The recommended set of the second round is . The recommended list is . The is browsed by her and the are added into her grey list. Because no movie can be recommended based on NN algorithm, the third round of recommendation is based on random algorithm. The recommended list is . She will quit the RS in the third round because the recommended movies are not in her interest list.

Figure 3(c) depicts the interaction between and the recommender. The interactive process consists of six rounds. The first round of recommendation is based on random algorithm. The recommended list of the first round is . The is browsed by her and is added into her grey list. The recommendations of the second round are based on NN and random algorithms. The number of items recommended by the NN algorithm is . The random recommended number is . The becomes her neighbor and recommends the . The two movies and the random recommended constitute the list . The is browsed by her and is added into her grey list. The recommendations of the subsequent rounds are based on the random algorithm because her neighbors and cannot recommend new movies.

Figure 3(d) depicts the interaction between and the recommender. The interactive process only includes one round because the recommended list is not in her interest list.

When there is no active user in the RS, the browse matrix is shown in Figure 3(e). From the viewpoint of the running result, it is more beneficial to recommend if the user’s interest is more extensive. Finally, the interactive recall is . The number of the successfully recommended items is 6; therefore the interactive diversity is . The user-recommender interaction forms an active learning scenario because the sequences of the users’ login and choice affect subsequent recommendation.

4. Experiments

We design two kinds of experimental schemes. The first kind includes three experimental approaches to find appropriate parameter settings of our hybrid algorithm. The second kind includes two experimental approaches for comparing the random and hybrid algorithms. Each approach is repeated 10 times, with different recommended items because of the random algorithm.

We try to answer the following questions through experimentation.
(1) What is the appropriate proportion of random recommendations?
(2) What is the reasonable threshold of kNN recommendations?
(3) What is the appropriate $k$ in kNN recommendations?
(4) How does the number of recommendations per round influence the performance of the algorithm?
(5) Does the hybrid algorithm outperform the random one?

4.1. Dataset

We experimented with the well-known MovieLens dataset, which is widely used in recommender systems (see, e.g., [10, 20]). The database schema is as follows:
(i) user (userID, age, gender, occupation),
(ii) movie (movieID, release-year, genre),
(iii) rates (userID, movieID).

We use the version with 943 users and 1,682 movies. The original rate relation contains ratings of movies on a 5-point scale, while we only consider whether a user has rated a movie. All users have watched at least one movie, and the dataset consists of approximately 100,000 movie ratings. However, the rating matrix is still sparse because no one has watched more than 45 percent of all movies, and only 20 percent of users have watched more than 10 percent of the movies.
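Dropping the rating value and keeping only the fact that a user rated a movie might look as follows. The sketch assumes tab-separated rows of user id, movie id, rating, and timestamp; this layout is an assumption about the ratings file at hand and should be checked against the copy being used. The function name `binarize` and the sample rows are illustrative only.

```python
from collections import defaultdict


def binarize(lines):
    """Map each user id to the set of movie ids she rated, ignoring the score."""
    interest = defaultdict(set)
    for line in lines:
        # Assumed row layout: user id, movie id, rating, timestamp.
        user_id, movie_id, _rating, _timestamp = line.split("\t")
        interest[int(user_id)].add(int(movie_id))
    return dict(interest)


# Illustrative rows in the assumed layout.
sample = [
    "196\t242\t3\t881250949",
    "186\t302\t3\t891717742",
    "196\t302\t4\t881250000",
]
interest = binarize(sample)
```

The resulting sets play the role of the rows of the user interest matrix.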

4.2. Experimental Design

We know which movies the user watches through the original dataset. These movies are assumed as the interest matrix of all users in our RS. The browsed matrix of all users is empty in the initial stage. In the process of incremental recommendations, the new browsed items are recorded into the browsed matrix. Finally we compute the interactive recall and diversity based on the browsed and interest matrix.

We redefine the recall and diversity metrics for measuring the interactive recommendation performance. The ratio of browsed items with respect to items potentially interesting to her will be called the interactive recall.

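Definition 5. The interactive recall of user $u_i$ on a set of items $V$ through an RS is
$$rec(u_i) = \frac{\sum_{j} B_{i,j}}{\sum_{j} I_{i,j}},$$
where $B$ is the browse matrix and $I$ is the user interest matrix.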

Now we study the interactive recall of the recommender.

Definition 6. The interactive recall of an RS is

Here we observe that the interactive recall is related to the user interest matrix.

Next we study the interactive diversity of the recommender. Let $B_u$ be the set of items browsed by user $u$ in the user-recommender interaction. The set of all items successfully recommended to at least one user is given by
$$S = \bigcup_{u \in U} B_u.$$
The interactive diversity is
$$div = \frac{|S|}{|V|},$$
where $U$ is the set of all users and $V$ is the set of all items.

Naturally, a higher interactive diversity indicates more diverse recommendation since more items are recommended and browsed. Our goal is to maximize the interactive recall and diversity. In other words, we expect each user to browse as many items as possible and the recommender to discover as many user interests as possible.
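The two metrics can be computed from the browse sets and interest sets as in the sketch below. Two points are assumptions made for this example rather than statements of the paper’s formulas: the recall of the whole RS is taken as the pooled ratio of browsed items to interesting items, and the diversity is normalized by the total number of items. The function names and data are hypothetical.

```python
def user_recall(browsed, interest):
    """Per-user recall: browsed items divided by interesting items."""
    return {u: len(browsed[u]) / len(interest[u]) for u in interest if interest[u]}


def pooled_recall(browsed, interest):
    """Assumed system-level recall: all browsed over all interesting items."""
    total = sum(len(s) for s in interest.values())
    return sum(len(browsed[u]) for u in interest) / total if total else 0.0


def interactive_diversity(browsed, items):
    """Share of all items browsed by at least one user (assumed normalization)."""
    covered = set().union(*browsed.values())
    return len(covered) / len(items)


# Hypothetical data: two users, eight items.
items = list(range(8))
interest = {"a": {1, 2, 3, 4}, "b": {2, 5}}
browsed = {"a": {1, 2}, "b": {2}}
```

Item 2 is browsed by both users but counts only once toward diversity, which is why recall and diversity can move in different directions.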

We design two kinds of experiments. The first kind determines appropriate settings of the ratio of random recommendations, the kNN threshold, and $k$ for the hybrid algorithm. The second kind compares the interactive recall and diversity of the random and hybrid algorithms.

In the first kind of experiments, we assign . Let , , and . Six experiments are undertaken to answer the questions raised at the beginning of the section one by one. A parameter is allowed to make a change, and the other parameters are set to a fixed value in each experiment. Each experiment is repeated 10 times, and the average recall and diversity are computed.

4.3. Results

In Figure 4(a), we fix the other parameters and vary the ratio of random recommendations. When the ratio changes from 0 to 0.4, the recall remains basically stable. When the ratio exceeds 0.4, the recall begins to decline rapidly.

The parameter settings of Figure 4(b) are the same as those of Figure 4(a). Overall, the diversity increases with the ratio of random recommendations.

In Figure 4(c), we fix the other parameters and vary the kNN threshold. The overall trend of the recall is downward as the threshold increases. When the threshold changes from 1 to 20, the recall decreases rapidly. When the threshold exceeds 20, the recall decreases slowly.

The parameter settings of Figure 4(d) are the same as those of Figure 4(c). The diversity fluctuates between 0.6 and 0.85, and the overall trend remains consistent.

In Figure 4(e), we fix the other parameters and vary $k$. When $k$ changes from 1 to 25, the recall increases almost linearly. When $k$ changes from 25 to 45, the recall increases slowly. When $k$ exceeds 45, the recall remains stable.

The parameter settings of Figure 4(f) are the same as those of Figure 4(e). The diversity fluctuates between 0.6 and 0.8, and the overall trend remains consistent.

Based on the first kind of experiments, our hybrid algorithm performs better when the ratio of random recommendations, the kNN threshold, and $k$ are assigned as 0.25, 3, and 45, respectively.

In the second kind of experiments, we fix the ratio of random recommendations, the kNN threshold, and $k$ and vary the total number of recommendations per round and the revisit number.

In Figure 5(a), the total number of recommendations per round varies. The recall increases almost linearly with the total number. The recall of the hybrid algorithm is better than that of the random one.

The parameter settings of Figure 5(b) are the same as those of Figure 5(a). When the total number changes from 1 to 25, the diversity increases almost linearly. When it changes from 25 to 50, the diversity remains stable. The diversity of the hybrid algorithm is almost the same as that of the random one.

In Figure 5(c), the revisit number varies. The recall generally increases with the revisit number.

The parameter settings of Figure 5(d) are the same as those of Figure 5(c). The diversity generally increases with the revisit number.

The recall of the hybrid algorithm is better than that of the random one; however, the opposite holds for the diversity.

5. Related Work

In this section we briefly present some of the research literature related to user-recommender interaction, active learning, kNN, hybrid filtering, and granular computing.

User-recommender interaction (URI) [21] is a framework and methodology for analyzing user tasks and recommender algorithms to generate useful recommendation lists. There are three pillars to URI: (1) the user-recommender interactive interface: receiving user request and getting one recommendation list from a recommender, (2) the recommender algorithm: generating one recommendation list based on user information, and (3) the recommender personality: the user’s perception of the recommender over a period of time.

Our interactive recommendation forms an active learning scenario. Active learning guides the acquisition of new knowledge suitable for the timely update of rated or browsed information [22]. Similarity-based methods have been applied to active learning [23–25]. The most widely used similarity-based method is kNN [26–29]. kNN is a method for classifying objects based on the closest training instances. In the user-to-user version, the similarity is typically computed from the users’ item ratings. The item-to-item kNN version computes the similarity between two items. Traditional similarity metrics include the cosine, Pearson correlation, mean squared differences, and Euclidean distance [17, 30].

Our hybrid algorithm consists of random and kNN ones. Hybrid filtering [9, 31] combines recommendation components of different types to achieve improved performance and reduce the cold-start problem. There are four main hybridization techniques: weighted, mixed, switching, and feature combination [9, 32]. Hybrid filtering is usually based on bioinspired or probabilistic methods [4] such as random algorithms [20], the kNN algorithm [33], and Bayesian networks [34, 35].

The parameters $k$ and $N$ are two different kinds of granule selection. $k$ selects the number of neighbors that participate in the kNN recommendation. $N$ denotes the number of items recommended to a user each time. We study the impact of different granules on the performance of our algorithm. Granular association rule [36, 37] is an intersection of relational data mining, granular computing, and recommender systems. The description of information granules with different attribute-value pairs and different sizes embodies the essence of granular computing [37]. Granular computing [38–40] is an emerging conceptual and computing paradigm of information processing. It delivers a cohesive framework supporting the formation of information granules and facilitating their processing and has recently seen considerable development [37, 41–43].

6. Conclusions

In this paper, we have proposed a hybrid recommender system for the interactive scenario. Through adjusting the recommendation parameters and comparing the random and hybrid algorithms, we may draw the following conclusions: (1) the ratio of random recommendations has no great influence on the performance as long as it is not too big (e.g., not more than 0.25), (2) one should employ kNN as early as possible, (3) the number of neighbors should be big enough (e.g., 45), (4) the recall increases nearly linearly with the number of recommendations in each round, and (5) the hybrid algorithm is better than the random one.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work is in part supported by National Science Foundation of China under Grant nos. 61379089 and 61379049, Seedling Project of Sichuan Province in China under Grant no. 2014-056, Scientific Research Starting Project of SWPU no. 2014QHZ025, and the Natural Science Foundation of Department of Education of Sichuan Province under Grant no. 13ZA0136.