Abstract

Although personal and group recommendation systems have been quickly developed recently, challenges and limitations still exist. In particular, users constantly explore new items and change their preferences throughout time, which causes difficulties in building accurate user profiles and providing precise recommendation outcomes. In this context, this study addresses the time awareness of the user preferences and proposes a hybrid recommendation approach for both individual and group recommendations to better meet the user preference changes and thus improve the recommendation performance. The experimental results show that the proposed approach outperforms several baseline algorithms in terms of precision, recall, novelty, and diversity, in both personal and group recommendations. Moreover, it is clear that the recommendation performance can be largely improved by capturing the user preference changes in the study. These findings are beneficial for increasing the understanding of the user dynamic preference changes in building more precise user profiles and expanding the knowledge of developing more effective and efficient recommendation systems.

1. Introduction

The fast proliferation of online information increases the users’ difficulties in finding target information, services, and products on the Internet. Recommendation systems act as a filtering service to fight against the information overload [1], with the clear purpose of identifying precise positioning of the information targets, as well as offering efficient resource utilization [2]. Their main functionality is to provide the users with recommendations that are more in line with the user personal preferences [3]. Moreover, recommendation systems can also increase business revenue [4], improve business efficiency [5], and strengthen the users’ loyalty toward business [6]. Currently, recommendation systems have been widely used in a variety of application domains, such as entertainment, commerce, and social networks, and many popular web platforms, including YouTube, Amazon, Spotify, and Facebook, largely leverage recommendation systems and technologies in their business to increase business profits and effectiveness [4], and promote the user satisfaction [7] and loyalty [8].

Recommendation systems are increasingly drawing the attention of practitioners and academics. Evidence from prior research shows that in addition to relevance, novelty and diversity are also important factors that need to be addressed in assessing recommendation systems [9]. Indeed, novel and diverse recommendations could not only help the users find relevant information and services but also support the users in discovering new items [10]. Furthermore, the novelty and the diversity of the recommendations can compensate for shortcomings of the recommendation systems. For example, the problem of long-tail items (e.g., less popular and newly added items) can be effectively addressed by diverse (e.g., with variant and wide-ranging features) recommendations [11]. Moreover, the influence of repeatedly recommended items on user satisfaction and business sales can be alleviated by considering the novelty of recommendations [12, 13]. Therefore, it is important to focus on relevance, novelty, and diversity in recommendation systems.

Recommendation systems can be generally categorized into personal recommendation systems and group recommendation systems according to their target range and the quantity of users [14]. Personal recommendation systems aim to provide a single user with relevant product or service recommendations, while group recommendation systems recommend items for a group of users [15]. The major differences between these two categories can be highlighted in terms of system design, user interaction, and business purposes. Regarding system design, personal recommendation systems usually use collaborative filtering and content-based methods to provide single users with information to aid them in information seeking [16], whereas group recommendation systems normally utilize aggregation techniques to offer recommendations for the groups of users [17]. For user interaction, personal recommendation systems involve explicit and implicit user-item interactions where each user interacts with items separately, whereas group recommendation systems involve interactions between groups of users and items where each user is represented as a part of a group. For business purposes, personal recommendation systems are mainly dedicated to support and fulfill the individual user’s goals, thereby increasing the profit, satisfaction, and loyalty of the individual users. At the same time, group recommendations are normally trying to assist and accomplish the group’s goals to increase the profit, satisfaction, and loyalty of the groups of users.

Although both personal and group recommendations have been quickly developed, challenges such as dynamic user preference changes, precise user profile establishment, data sparsity, and recommendation diversity still exist. In particular, the issue of user preference changes over time has become one of the most challenging tasks because it can cause difficulties in building accurate user profiles within recommendation systems and has a considerable impact on recommendation performance. More importantly, user preferences are complex and are usually grouped into long-term and short-term preferences [18, 19]. Long-term preferences can be more stable and change slowly over a certain period. Short-term preferences can be quickly influenced by user instant demands and recent interests. Therefore, to better understand the changes in the user preferences over time, build more accurate user profiles, and improve the recommendation performance, the time awareness of the user preferences for individual and group recommendations needs to be explored further.

Some studies examine the user preference changes (e.g., [20, 21]) and explain the role of long-term and short-term preferences in recommendations (e.g., [22, 23]), but there is a limited focus on the user preference dynamics. It can be argued that without drawing enough attention to the user preference dynamics, the recommendation performance could be limited or could drop, as recommendations that were relevant in the past might not meet users’ preferences at present because the users have changed their preferences [24, 25]. Even those rare studies that investigate dynamics in the user preferences (e.g., [26]) do not offer a systematic understanding of how user preferences change and lack an analysis of the relevant effects on recommendations. More importantly, there is no study that particularly addresses time awareness in the recommendations for both individuals and groups of users. Arguably, this lack of understanding may hinder the development of effective and efficient recommendation systems. Hence, this study primarily investigates the user preference changes over time to address this gap. More specifically, the following research questions are investigated: (1) What are the effects of the user preference changes on the individual user and the group recommendations? (2) How do user groups affect the precision, recall, novelty, and diversity of group recommendations? To carry out the research, an experimental study is conducted by proposing a time-aware hybrid approach to generate recommendations for individuals and groups of users. The proposed approach consists of collaborative filtering, a content-based method, and a method that aggregates personal recommendations. Collaborative filtering is employed for candidate item selection. The content-based method is used for building the user profiles as well as providing top personal recommendations. The aggregation method for personal recommendations is used for group recommendations.

This study contributes to recommendation systems research by developing a time-aware hybrid recommendation approach for both individual and group users in which user preference changes over time are explicitly addressed to improve the recommendation performance. This approach will help developers increase their understanding of capturing the user dynamic preference changes in building more precise user profiles as well as strengthen the knowledge of the user profile modeling for personal and group recommendations. Our study can also help scholars focus on multiple metrics, including precision, recall, novelty, and diversity, to evaluate the recommendation outcomes, providing deeper insights into measuring the efficiency of the recommendation systems.

The rest of this paper is organized as follows. The section of related work describes a variety of approaches for personal and group recommendation, and the importance of the user preference change is addressed. In the research methodology section, we introduce a time-aware hybrid recommendation approach used in this study. The section of experimental study identifies evaluation metrics and describes the design of personal and group recommendations. Then, the detailed experimental results are described and discussed in the results and discussion section. Finally, our conclusions, implications, limitations, and future work are described in the conclusion section.

2. Related Work

2.1. Personal Recommendations

Personal recommendation systems aim to provide a single user with relevant product or service recommendations. They support individual users in the decision-making process, and help to achieve users’ goals and fulfill their needs. A variety of recommendation techniques have been used for developing personal recommendation systems, such as collaborative filtering, content-based filtering, and knowledge-based and demographic recommendation systems. However, most of these techniques have challenges and limitations, including cold-start [27], data sparsity [28], and limited content analysis [29] problems. To overcome these problems and limitations, hybrid recommendation systems are popularly used. In recent years, hybrid recommendation systems that combine two or more recommendation techniques have mainly been used for personal recommendations. More importantly, hybrid personal recommendation systems could be further improved by exploring additional information, such as user-generated content and supplementary user information and context item information.

User-generated content, such as user reviews, item description tags, and ranking, can help recommendation systems better understand the user preferences and improve the recommendation performance. For example, Wang et al. [30] developed a hybrid collaborative filtering method that combines preliminary recommendations with the sentiment analysis of the user reviews to increase the accuracy of recommendations. Likewise, Qian et al. [31] proposed an emotion-aware recommendation system based on the hybrid information fusion. The results show that there is a significant improvement in the recommendation performance by using more user-generated information.

Furthermore, the performance of the hybrid personal recommendation systems can be promoted by acquiring supplementary user information and contextual item information [32–34]. The common techniques that are used for acquiring this information include ontology and deep learning. Tarus et al. [32] used a hybrid recommendation approach that consists of ontology and sequential pattern mining techniques. The ontology is used to depict the knowledge about user interests and item features, which can help their proposed approach to alleviate the cold-start and data sparsity problems. Likewise, sequential pattern mining is used for identifying the user historical sequential learning patterns to increase the recommendation performance and improve the accuracy of predictions. Similarly, Kermany and Alizadeh [33] carried out an ontology-based study, developing a hybrid recommendation system that employs fuzzy multicriteria collaborative filtering and item-based ontological semantic filtering approaches. This study addresses users’ demographic information as external information sources to improve the performance of the recommendation system. In addition, Kim et al. [34] conducted a study that uses the convolutional neural network with probabilistic matrix factorization to capture the contextual information of the items. The findings confirm the effectiveness of using convolutional neural networks in capturing contextual information from item descriptions, showing the higher quality of recommendations.

2.2. Group Recommendations

A group recommendation system is another popular type of recommendation system that focuses on recommendations for groups of users. To provide more precise group recommendations, user model aggregation approaches integrate the individual user preference profiles into a group profile to generate relevant recommendations [17, 35]. These approaches, such as the average aggregation and the least misery aggregation, primarily focus on user profiles, combining individual user preferences into group preferences to generate group recommendations. However, simple user model aggregation methods for group recommendations could lead to lower performance because the users could have contradictory preferences and the number of users in a group could be large. To cope with contradictory preferences among users in a group, Guo et al. [36] developed a group recommendation approach in which the group recommendation process is transformed into a multicriteria decision-making (MCDM) process. This method can better address the contradicting preferences in the group and alleviate the cold-start problem in the group recommendation. The concurrent work of Guo et al. [37] has also considered a similar approach based on the user model aggregations. More specifically, the authors explored the heterogeneity among users’ preferences and aggregated the predicted preference relations into the group profile by utilizing the Borda voting rule. The results show that utilizing preference relations optimizes group profile modeling and improves the efficiency of the recommendations. In addition, to address a large number of users in a group, Seo et al. [35] introduced a user model aggregation method that takes into account deviations for group recommendations to improve the recommendation performance even in cases with a large number of users in a group. Overall, model aggregation approaches show their efficiency in recommendations for the users with contradictory preferences and large groups of users. However, the main problem of model aggregation methods lies in the difficulty of integrating them into hybrid recommendation systems, which could limit their performance in group recommendations.

Another important approach for group recommendations is the aggregation of personal recommendations [38]. This approach first generates recommendations for each user of a group and then converts the generated personal recommendations into group recommendations. For example, Villavicencio et al. [39] used the multilateral Monotonic Concession Protocol (MCP) to combine individual recommendations provided for each user in a group into group recommendations. The authors proposed an extension of a multiagent approach based on a negotiation technique that improves the quality of the group recommendations. More importantly, the personal recommendation aggregation approaches can be better integrated into hybrid recommendation systems to achieve desirable recommendation outcomes. For example, Kassak et al. [40] combined collaborative and content-based methods and converted the recommendations for individual users into group recommendations by utilizing the conjunctive aggregation function. By doing so, their approach can better address conflicting preferences within a group and improve the quality of group recommendations. Similarly, Pessemier et al. [41] presented a hybrid recommendation system that combines individual recommendations into group recommendations using a two-step aggregation method. In the first step, the average without misery (AvgWM) method is used to generate a list of group recommendations. The concept of the average without misery method is to find the optimal decision for the group without offending any participant with this decision. In the second step, the users give feedback on the generated list and select their final favorite recommendations.
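To make the idea of average without misery concrete, the following minimal sketch averages the members' ratings per item and drops any item that at least one member rates below a threshold. It illustrates the general technique only and is not the implementation of [41]; the function name, the rating dictionaries, and the threshold value are assumptions chosen for illustration.

```python
def average_without_misery(ratings_by_user, misery_threshold):
    """Rank items by mean rating, skipping items any member rates too low.

    ratings_by_user: {user_id: {item_id: rating}} (hypothetical structure).
    misery_threshold: assumed cut-off below which a member is "miserable".
    """
    if not ratings_by_user:
        return []
    # Only items rated by every member of the group are considered.
    common_items = set.intersection(
        *(set(ratings) for ratings in ratings_by_user.values())
    )
    scores = {}
    for item in common_items:
        ratings = [user_ratings[item] for user_ratings in ratings_by_user.values()]
        if min(ratings) < misery_threshold:
            continue  # at least one member would be unhappy with this item
        scores[item] = sum(ratings) / len(ratings)
    # Highest average first.
    return sorted(scores, key=scores.get, reverse=True)


group = {
    "u1": {"a": 5, "b": 4, "c": 1},
    "u2": {"a": 4, "b": 4, "c": 5},
}
group_list = average_without_misery(group, misery_threshold=2)
```

In this example, item "c" is excluded although its average is not the lowest, because one member rates it below the threshold.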

2.3. User Preferences Changes

The users change their preferences as time passes [20]. Rafailidis and Nanopoulos [42] pointed out that changes in the user preferences can vary at different rates. These changes have direct effects on the precision, novelty, and diversity of the recommendations [43, 44]. Therefore, the user preference changes need to be fully recognized to ensure the effectiveness of recommendations. Generally, the user preferences can be divided into long-term preferences and short-term preferences [18, 19, 22, 23]. The former characterizes the user general interests that are relatively stable or that change slowly over time, whereas the latter usually refers to the user temporal interests, and they can be easily influenced by a variety of factors, such as user instant demands, recent interests, and global mainstream trends in a short period of time [18].

Recently, the user long-term and short-term preferences have drawn attention from research to obtain more precise and accurate user preference profiles and improve the recommendation performance [18, 22, 23, 45]. For example, Tan and Liu [22] incorporated an attention mechanism into a recurrent neural network to capture the user preference changes and model the user long-term and short-term preferences. Similarly, Hu et al. [23] focused on user short-term preferences and developed a graph neural recommendation model that incorporates user recent activities with the attention mechanism on recurrent neural networks to explore the user short-term preferences. Furthermore, Yu et al. [18] extended the traditional recurrent neural network structure by addressing time-aware and content-aware controllers that integrate both short-term and long-term user preferences and achieve superior performance in terms of the AUC and the F1-score measures. Furthermore, Liu et al. [45] developed a hybrid attention mechanism of recurrent neural networks to capture users’ long-term preferences and reinforce short-term preferences. More precisely, the authors combine the item description and visual information that makes the recommendations more apparent and interpretable and accomplish better recall and NDCG measures. Overall, recent studies have proven that distinguishing user long-term and short-term preferences could lead to more precise user profiling and achieve better recommendation performance. However, it is vital to note that the users tend to change their preferences over time. For this point, the user long-term and short-term preference profiles could have limitations if user preference changes over time are not sufficiently considered.

A number of studies (e.g., [20, 21, 26]) have explored users’ preference changes over time to improve the user profiling and recommendation system performance. For example, Inuzuka et al. [21] investigated the user changes in preferences based on the user interaction with the recommendation systems, indicating that a better approach to address the user preference changes could largely improve recommendation system performance. Similarly, Rafailidis [20] focused on the pairwise correlations between the latest preferences and former preferences to better capture users’ changing preferences. Lin and Chen [26] addressed user preference changes over time by separating the differentiation of the recent and the early user and item data. This study proposes a probabilistic collaborative filtering model based on the hidden Markov models (HMMs) that can obtain changes in item properties and capture the changes in the states of user preferences. According to the evidence from the previous studies, the user preference change is the critical part of forming a user profile, which must be better understood. It can be argued that without sufficient attention to an awareness of the user preference changes, the recommendation systems may still face challenges in providing the users with the recommendation outcomes that are more in line with the user personal needs. Thus, it is important to consider time awareness in the recommendation systems.

3. Research Methodology

This study will focus on providing precise, novel, and diverse top recommendations for individuals and groups of users in a business recommendation domain (e.g., restaurants, local services, hotels, and entertainment facilities recommendations). To conduct the study, a time-aware hybrid recommendation approach is proposed. The proposed approach includes five components. First, neural collaborative filtering is used to select candidate items for further top personal and group recommendations. To increase the accuracy of the candidate items, users’ information is extended by including users’ gender and location, such as a city or state. Second, users’ long-term preference profiles based on users’ interaction history with items and items’ categories are obtained. The decay function is applied for feature weight adjustment. Third, users’ short-term preference profiles that reflect users’ most recent preferences are obtained based on items’ features extracted from users’ reviews. Fourth, top personal recommendations for the individual users are provided based on selected candidate items considering users’ long-term and short-term preference profiles. Finally, top group recommendations for the different groups of users are provided by aggregating the top personal recommendations.

3.1. Candidate Items Selection

Collaborative filtering is used to select accurate candidate items for personal recommendations. Specifically, neural collaborative filtering proposed by He et al. [46] has been improved by incorporating users’ supplementary information, including users’ gender and location, to select more precise candidate items because users with similar demographic features (e.g., gender and geographic location) tend to have similar preferences [2]. Therefore, users’ gender and location are obtained in the user gender prediction and the user location prediction modules.

The user gender prediction module takes users’ first names as an input and predicts the users’ gender. A long short-term memory (LSTM) recurrent neural network is used in the user gender prediction module. The ability to learn long-term dependencies makes LSTM advantageous for predicting the different sequences, such as text sequences or letter sequences. The LSTM neural network consists of repeating modules. Every repeating module includes the forget layer, the input layer, and the output layer.

In the forget layer, equation (1), the sigmoid function decides what information should be discarded from the cell. The output is a value between 0 and 1 for every number in the cell:

$$f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f) \quad (1)$$

where σ is the sigmoid nonlinearity, W is the weight parameter, h and x characterize the hidden output vector and the input feature vector, respectively, and b is the corresponding bias.

The input layer determines what information should be kept in the cell. It has two parts. In the first part, equation (2), a sigmoid function determines which values should be updated. In the second part, equation (3), a tanh function establishes a vector of new candidate values, $\tilde{C}_t$:

$$i_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i) \quad (2)$$

$$\tilde{C}_t = \tanh(W_C \cdot [h_{t-1}, x_t] + b_C) \quad (3)$$

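The previous cell state is updated into the new cell state. The old state is multiplied by the output from the forget layer in equation (4), and the new candidate values, scaled by the output of the input layer, are added:

$$C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t \quad (4)$$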

The output layer determines the output. The sigmoid function decides what segment of the cell state will be taken as the output (please see equation (5)). The tanh function takes the cell state and gives values from −1 to 1, which decide the importance level. Finally, the output of the tanh function is multiplied by the output of the sigmoid function in equation (6):

$$o_t = \sigma(W_o \cdot [h_{t-1}, x_t] + b_o) \quad (5)$$

$$h_t = o_t \odot \tanh(C_t) \quad (6)$$
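The forget, input, and output layers described above can be traced in a single time step with the following minimal NumPy sketch. It assumes the standard LSTM formulation; the function names, the parameter dictionary, the random initialization, and the vector sizes are illustrative assumptions and do not come from the gender prediction module itself.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def init_params(input_size, hidden_size, seed=0):
    """Hypothetical helper: small random weights and zero biases per layer."""
    rng = np.random.default_rng(seed)
    params = {}
    for gate in ("f", "i", "c", "o"):
        params[f"W_{gate}"] = rng.normal(
            0.0, 0.1, (hidden_size, hidden_size + input_size)
        )
        params[f"b_{gate}"] = np.zeros(hidden_size)
    return params


def lstm_step(x_t, h_prev, c_prev, params):
    """One step of a standard LSTM cell."""
    z = np.concatenate([h_prev, x_t])                     # [h_{t-1}, x_t]
    f_t = sigmoid(params["W_f"] @ z + params["b_f"])      # forget layer
    i_t = sigmoid(params["W_i"] @ z + params["b_i"])      # input layer, part 1
    c_hat = np.tanh(params["W_c"] @ z + params["b_c"])    # candidate values
    c_t = f_t * c_prev + i_t * c_hat                      # new cell state
    o_t = sigmoid(params["W_o"] @ z + params["b_o"])      # output layer
    h_t = o_t * np.tanh(c_t)                              # hidden output
    return h_t, c_t


# Example: a 3-dimensional input (e.g., an encoded letter) and 4 hidden units.
params = init_params(input_size=3, hidden_size=4)
h_t, c_t = lstm_step(np.ones(3), np.zeros(4), np.zeros(4), params)
```

For a first name, such a step would be applied once per letter, carrying `h_t` and `c_t` forward to the next letter.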

The user location is estimated in the user location prediction module. Specifically, the user location prediction module is developed based on interactions between the users and the items. To estimate the location of a particular user, the items that the user has interacted with are clustered by their location. The largest cluster is then selected as the user’s estimated location.
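A minimal sketch of this estimation step is given below. It assumes that clustering by location amounts to grouping items by a discrete location label such as a city name; the function name and the example data are hypothetical.

```python
from collections import Counter


def estimate_user_location(item_locations):
    """Return the location shared by the largest group of interacted items.

    item_locations: one location label (e.g., a city) per interacted item.
    Grouping by an exact label is an assumption made for this sketch.
    """
    if not item_locations:
        return None  # no interactions, so no estimate is possible
    counts = Counter(item_locations)
    location, _ = counts.most_common(1)[0]
    return location


estimated = estimate_user_location(["Phoenix", "Las Vegas", "Phoenix", "Phoenix"])
```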

For candidate item selection, neural network collaborative filtering that uses supplementary user information from the user gender prediction and the user location prediction modules is proposed (Figure 1). The proposed neural network takes items’ and users’ information, including gender and estimated location, as an input in the input layer. This supplementary user information is helpful to enhance the accuracy of the predicted scores. Following the input layer, the embedding layer is used to represent users’ and items’ information as continuous vectors. Embedding helps to indicate syntactic and semantic characteristics of items’ and users’ information as well as capturing the relationship among them. To convert embedded users’ and items’ information into one dimension, the flatten layer is applied next. Subsequently, flattened users’ and items’ information is concatenated in the concatenate layer. To cope with the problem of overfitting, a dropout regularization method is used in the following dropout layer [47]. The dropout layer is followed by several hidden layers with dropout regularization between them. In the hidden layers, the model learns interactions between users’ and items’ latent features. These layers contain a decreasing number of neurons, with the number of neurons halved in each successive hidden layer. In this way, more abstractions can be learned from the data. Finally, the output layer is represented by a single fully connected neuron that predicts the scores that users would give to unknown items. The items with predicted scores above the 3.5 threshold are selected as candidate items for further processing.
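Two small details of this design, the halving of hidden layer widths and the final thresholding of predicted scores, can be sketched as follows. The sketch does not reproduce the network itself; the function names, the starting width of 64 neurons, and the example scores are assumptions for illustration.

```python
def hidden_layer_sizes(first_layer_units, num_layers):
    """Widths of successive hidden layers, each half the size of the previous."""
    return [max(1, first_layer_units // (2 ** k)) for k in range(num_layers)]


def select_candidates(predicted_scores, threshold=3.5):
    """Keep items whose predicted score is strictly above the threshold."""
    return [item for item, score in predicted_scores.items() if score > threshold]


sizes = hidden_layer_sizes(64, 3)
candidates = select_candidates({"item_a": 4.2, "item_b": 3.5, "item_c": 2.0})
```

Here "item_b" is not selected because its score equals the threshold rather than exceeding it.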

3.2. User Long-Term Preferences Profile

Users’ long-term preferences are extracted from the items’ description attributes (e.g., business name, business category, business features) and items’ common characteristic attributes (e.g., opening hours, location, price). Although the number of description attributes and common characteristic attributes is relatively small, these attributes can precisely describe and categorize items and increase the effectiveness of building the user profiles.

To represent the users’ profiles, the vector space model (VSM)-based representation was employed as a user model representation method because it takes into account the importance of the different attributes in the user profile. In the vector space model representation, a user profile P is represented as an n-dimensional vector in which each dimension matches a distinct item feature and n is the total number of these features. The example of a user profile represented in a vector space model is illustrated in equation (7):

$$P = \big((f_1, w_1), (f_2, w_2), \ldots, (f_n, w_n)\big) \quad (7)$$

where f stands for the item feature and w indicates the importance of that feature in the user profile.

In users’ long-term preferences profiles, users’ features are obtained based on users’ interactions with items. The initial weights of users’ features are calculated by the frequency of occurrence during the interaction with items. To illustrate the process of calculating the initial weights of a particular user profile, assume that user $u$ has interacted with two items, $i_1$ and $i_2$. The first item contains the features $f_1$, $f_2$, whereas the second item contains the features $f_1$, $f_3$, and $f_4$. Based on the user interaction, user $u$ interacted with four features from two items. The feature $f_1$ is represented in both the items that the user has interacted with, and its initial weight would be set as two. Whereas with features $f_2$, $f_3$, and $f_4$, user $u$ only interacted once; therefore, their initial weights would be set as one. However, to more accurately reflect the user preferences in the user profile, the initial feature weight requires an adjustment based on the time.
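The frequency-based initial weights can be computed with a simple counter, as in the sketch below. The feature labels and the function name are hypothetical; the example mirrors the case of two items that share exactly one feature.

```python
from collections import Counter


def initial_feature_weights(interacted_items):
    """Count in how many interacted items each feature occurs.

    interacted_items: one list of feature labels per interacted item.
    """
    weights = Counter()
    for features in interacted_items:
        # set() ensures a feature counts once per item.
        weights.update(set(features))
    return dict(weights)


weights = initial_feature_weights([["f1", "f2"], ["f1", "f3", "f4"]])
```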

To reflect the user preference changes in a user profile, the user’s feature weights decrease over time. For this purpose, the exponential decay function, equation (8), has been applied to adjust feature weights in the user’s long-term preferences profile based on the time when the user has interacted with a particular item. More specifically, the items from the user’s more recent interactions become more important and have a greater impact on the user’s long-term preferences profile than the items from the user’s earlier interactions. In equation (8), one parameter controls the decay rate, a second is the time parameter, and a third, user-specific parameter determines whether a particular user tends to change their preferences often or not (equation (9)). To determine the user’s tendency to change, the similarity between all the items that a user has interacted with is divided by the number of such items. The greater this parameter is, the less a particular user tends to change their preferences. In equation (9), n represents the number of all the items that the user interacted with.

In addition, as each user changes their preferences with different velocities, the parameter from equation (9) is calculated for each user, which affects the decay rate of equation (8).
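The interplay between a time-based exponential decay and a per-user tendency to change can be illustrated with the sketch below. The exact functional forms are assumptions made for illustration only: the weight is taken to decay as w = w0 * exp(-rate * t / stability), and the user parameter is taken to be the summed item-to-item similarity divided by the number of items. All names are hypothetical.

```python
import math


def change_tendency(pairwise_similarities, num_items):
    """ASSUMED form: summed similarity between interacted items over item count.

    A larger value suggests a user who changes preferences less often.
    """
    if num_items == 0:
        return 0.0
    return sum(pairwise_similarities) / num_items


def decayed_weight(initial_weight, elapsed_time, decay_rate, stability):
    """ASSUMED form: w = w0 * exp(-decay_rate * elapsed_time / stability)."""
    safe_stability = max(stability, 1e-9)  # avoid division by zero
    return initial_weight * math.exp(-decay_rate * elapsed_time / safe_stability)


stability = change_tendency([0.5, 0.7, 0.9], num_items=3)
recent = decayed_weight(2.0, elapsed_time=1.0, decay_rate=0.1, stability=stability)
older = decayed_weight(2.0, elapsed_time=12.0, decay_rate=0.1, stability=stability)
```

Under these assumptions, a feature from a recent interaction keeps more of its weight than the same feature from an older interaction, and a more stable user loses weight more slowly.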

3.3. User Short-Term Preferences Profile

The users’ short-term preferences are obtained from the user reviews of items. The user reviews are a type of user-generated content and are represented in a free text form. All users can freely express their opinions toward items through reviews, which may be helpful to understand the reasons why the users like or dislike a particular item. Thus, item features can be extracted from the users’ reviews. To extract the item features, all the user reviews toward a particular item have been aggregated. These item features are extracted by using the n-gram language model algorithm [48]. An n-gram is a sequence of n words occurring in a given text corpus. Moreover, to extract richer and more precise features without including exceedingly rare ones, unigrams, bigrams, and trigrams have been considered, where n equals 1, 2, and 3, respectively. Finally, the extracted item features from the aggregated users’ reviews are stored in the item profiles.
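A minimal sketch of extracting unigrams, bigrams, and trigrams from aggregated reviews is shown below. The tokenization rule, the function names, and the minimum-count filter used to drop rare n-grams are assumptions for illustration and are not taken from [48].

```python
import re
from collections import Counter


def extract_ngrams(text, max_n=3):
    """Return all word n-grams of the text for n = 1..max_n."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    grams = []
    for n in range(1, max_n + 1):
        for start in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[start:start + n]))
    return grams


def item_features(reviews, min_count=2, max_n=3):
    """Aggregate reviews of one item and keep n-grams seen at least min_count times.

    min_count is an assumed way of filtering exceedingly rare features.
    """
    counts = Counter()
    for review in reviews:
        counts.update(extract_ngrams(review, max_n))
    return {gram for gram, count in counts.items() if count >= min_count}


grams = extract_ngrams("Great pizza place")
features = item_features(["Great pizza", "great pizza and pasta"])
```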

Figure 2 shows the process of obtaining the user short-term preferences profile from the user reviews. As shown in the figure, to obtain the user short-term preferences profile, the user reviews of a particular item are compared with each feature that was extracted through the n-gram algorithm and stored in the item profile. To compare the user reviews with the item features, the user reviews and the item features are mapped to vectors of real numbers. To be specific, Embeddings from Language Models (ELMo) are used to represent the item features and the user reviews as vectors, or embeddings [49]. By doing so, syntactic and semantic characteristics of items’ and users’ features are modelled to improve the quality of the embeddings. More importantly, ELMo provides contextual representations, by which polysemous features can be distinguished in the embeddings. Consequently, the similarity between item features and user reviews can be accurately calculated. After that, the most similar features are selected for the user short-term preferences profile.

In addition, a sliding-window algorithm is used to catch the most recent user preferences. The sliding-window algorithm usually considers a time-based sliding window or count-based sliding window [50]. The former relies on a user’s interactions with a service from the fixed latest time interval (latest day, month, year, etc.), whereas the latter considers a fixed number of the latest user interactions with a service (latest ten, twenty, fifty, etc. interactions). Considering that each user interacts with the system differently, the sliding-window algorithm is primarily used for capturing the latest number of user interactions in the current study. Please note that based on the empirical measurements, during the process of obtaining the user short-term preferences profile, only the latest twenty user interactions are taken into consideration to reduce time complexity, improve the speed of user modeling, and catch only the recent preferences. Moreover, to obtain more precise user short-term preference changes, the user short-term preferences profile is updated dynamically every time the user interacts with a new item.
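The count-based window over the latest twenty interactions can be sketched as follows. The representation of an interaction as a (timestamp, item) pair and the function name are assumptions for illustration.

```python
def latest_interactions(interactions, window_size=20):
    """Return the items of the most recent window_size interactions.

    interactions: iterable of (timestamp, item_id) pairs in any order.
    """
    ordered = sorted(interactions, key=lambda pair: pair[0])
    return [item for _, item in ordered[-window_size:]]


history = [(t, f"item{t}") for t in range(30)]
window = latest_interactions(history)
```

For dynamic updates after each new interaction, a `collections.deque` created with `maxlen=20` gives the same behavior incrementally, since appending a new element discards the oldest one.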

3.4. Top Personal Recommendations

The selected candidates are processed using the user long-term and short-term preference profiles to form the top personal recommendations. The similarity between the user long-term profile attributes and the item category attributes from the selected candidates is calculated. The cosine similarity measure is used to examine the similarity between two vectors of an inner product space. It calculates the cosine of the angle between the two vectors:

$$\cos(\theta) = \frac{A \cdot B}{\|A\| \, \|B\|} = \frac{\sum_{i=1}^{n} A_i B_i}{\sqrt{\sum_{i=1}^{n} A_i^2} \, \sqrt{\sum_{i=1}^{n} B_i^2}}, \tag{10}$$

where $A_i$ and $B_i$ are the components of vectors A and B, respectively. The output of similarity ranges from −1 to 1, where −1 indicates that two vectors are completely opposite, whereas 1 denotes that two vectors are completely similar to each other.
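As an illustration, the cosine similarity measure can be written in a few lines of plain Python. This is a minimal sketch; returning 0.0 for a zero-length vector is an assumption made here to avoid division by zero and is not stated in the study.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumption: an all-zero vector is treated as "no similarity"
    return dot / (norm_a * norm_b)
```

Vectors pointing in the same direction yield 1, orthogonal vectors yield 0, and opposite vectors yield −1.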

The item candidates that have the highest similarity with the user long-term preference profile attributes are selected and sorted in the descending order. After that, the processed items from the previous step are sorted based on a similarity score (equation (10)) between the user short-term preferences profile and the extracted item attributes from the user reviews. A final list of the top personal recommendations that considers short-term and long-term preferences is formed. The algorithm for the top personal recommendations is given as Algorithm 1 in Table 1.

As shown in Table 1, the list of the item candidates is primarily obtained for a selected user. The user short-term and long-term preference profiles are constructed. The recommendations data frame, rec-df, is created to store the recommendation list with the similarity scores. The similarities, long_sim and short_sim, between the item features and the user profiles are calculated and stored in the recommendations data frame. Then, the recommendations are sorted by the similarity with long-term preference profiles. The items with the greatest similarity are selected for further processing. Finally, the recommendations from the previous step are sorted by the similarity with short-term preference profiles. The final list is formed from the top items from the recommendations data.
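The two-stage ranking described above can be sketched as follows. This is not the authors' Algorithm 1; the function name, the candidate layout (one category vector and one review-feature vector per item), and the `shortlist_size` parameter are assumptions introduced for illustration. The names `rec_df`, `long_sim`, and `short_sim` follow the description in the text.

```python
import math


def cosine(a, b):
    """Cosine similarity; returns 0.0 for zero vectors (assumption)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def top_personal_recommendations(candidates, long_profile, short_profile,
                                 shortlist_size, top_n):
    """candidates: dict item_id -> (category_vector, review_feature_vector)."""
    # Store both similarity scores for every candidate item.
    rec_df = [
        {"item": item_id,
         "long_sim": cosine(long_profile, cat_vec),
         "short_sim": cosine(short_profile, rev_vec)}
        for item_id, (cat_vec, rev_vec) in candidates.items()
    ]
    # Stage 1: keep the items closest to the long-term profile.
    shortlist = sorted(rec_df, key=lambda r: r["long_sim"],
                       reverse=True)[:shortlist_size]
    # Stage 2: re-rank the shortlist by the short-term profile.
    final = sorted(shortlist, key=lambda r: r["short_sim"],
                   reverse=True)[:top_n]
    return [r["item"] for r in final]


# Toy data: item "c" matches the short-term profile but not the long-term one.
toy_candidates = {
    "a": ([1.0, 0.0], [0.0, 1.0]),
    "b": ([1.0, 0.1], [1.0, 0.0]),
    "c": ([0.0, 1.0], [1.0, 0.0]),
}
toy_result = top_personal_recommendations(
    toy_candidates, [1.0, 0.0], [1.0, 0.0], shortlist_size=2, top_n=2)
```

In the toy data, item "c" is filtered out in the first stage despite its high short-term similarity, which shows that the long-term profile acts as a gate and the short-term profile only reorders what passes it.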

3.5. Top Group Recommendations

The recommendations for the group of users are provided by aggregating the personal recommendations. The approach of forming the group recommendations from the personal recommendations is vital because when the recommendations are made for every user individually, it allows for catching the individual preference changes for each user in the group and dynamically reflects these changes in the final recommendations. Algorithm 2 in Table 2 is the pseudocode of the top recommendations for the groups of users.

The list of aggregated recommendations, aggregatedRec, is created first to aggregate all personal recommendations for every user in the group. The top personal recommendations for every user in the group are provided by using Algorithm 1. After that, the similarity matrix, all_users_similarity, between all items from the aggregated personal recommendations and the users in the group is created. Then, the user profiles for every user in the group are obtained. The user profiles are compared with all items from the aggregated personal recommendations, and their similarity is stored in the aggregated similarity list. Consequently, every user in the group and their similarity with every item are added to the similarity matrix all_users_similarity. The average similarity between the users and every item in the similarity matrix is then calculated. Finally, the items are sorted by the average similarity in descending order, and the final top items are selected as the recommendations for the group of users. It should be mentioned that Algorithm 2 can provide top recommendations for any number of users in the group.
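The aggregation steps above can be sketched as follows. This is an illustrative reading of the description rather than the authors' Algorithm 2; the function signature and the `similarity` callable are hypothetical, and the personal recommendation lists are assumed to be computed beforehand.

```python
def top_group_recommendations(personal_recs, similarity, top_n):
    """personal_recs: dict user -> list of recommended item ids.
    similarity: callable (user, item) -> float, e.g. profile-to-item cosine."""
    users = list(personal_recs)
    if not users:
        return []
    # Aggregate all personal recommendations, keeping each item once.
    aggregated_rec = []
    for recs in personal_recs.values():
        for item in recs:
            if item not in aggregated_rec:
                aggregated_rec.append(item)
    # Similarity matrix: every user in the group against every aggregated item.
    all_users_similarity = {
        item: [similarity(user, item) for user in users]
        for item in aggregated_rec
    }
    # Average over the group, then sort in descending order.
    average = {item: sum(s) / len(s) for item, s in all_users_similarity.items()}
    return sorted(aggregated_rec, key=lambda i: average[i], reverse=True)[:top_n]


# Toy example with two users and precomputed similarities.
toy_sims = {("u1", "x"): 0.9, ("u2", "x"): 0.1,
            ("u1", "y"): 0.2, ("u2", "y"): 0.8,
            ("u1", "z"): 0.6, ("u2", "z"): 0.7}
toy_group = top_group_recommendations(
    {"u1": ["x", "z"], "u2": ["y", "z"]},
    lambda user, item: toy_sims[(user, item)],
    top_n=1)
```

In the toy example, item "z" wins because both users find it moderately similar, whereas "x" and "y" each appeal strongly to only one user.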

4. Experimental Study

In this section, an experimental study is conducted to evaluate the performance of our proposed time-aware hybrid approach for individual and group recommendations. This proposed approach is measured by providing the top 5, 10, and 20 recommendations for individuals as well as groups of users in the business sector (e.g., restaurants, local services, hotels and entertainment facilities recommendations).

4.1. Datasets

The Yelp dataset is used in this study. The dataset contains business data and user data [51]. Business data include precise business information, such as city, state, latitude and longitude, stars, review count, business attributes, business open hours, and categories. User data contain rich information, such as user friend mapping, user name, user ID, number of reviews, average given stars, review data, and registration date (please see the examples in Tables 3 and 4). Initially, the dataset contained 6,685,900 reviews of 192,609 businesses. However, to better capture the users’ preference changes at different time points, the dataset was further refined and data were reselected, retaining the most active users and yielding a final 313,261 user reviews of 78,138 businesses. This dataset is split into two parts: a training set that contains 70% of the data and a test set that covers 30% of the data.

In addition, to train the LSTM recurrent neural network in the user gender prediction module, the National Data that depict the frequency of individuals’ given names in the United States with an associated social security number are used in this study [52]. The data are based on social security records on March 3, 2019, and contain the records of given names for more than a hundred years. In total, the data include 98,399 unique names. The data cover the fields of name, gender, and the frequency of a name. This dataset is divided into the training, testing, and validation sets according to proportions of 60%, 20%, and 20%, respectively.

4.2. The User Changes Preferences Example

The users change their preferences at different rates. Here, an example is provided to illustrate how user preferences change over a long period of time based on the user long-term preference profiles (from 2013 to 2019). Two users and one item were selected randomly from the Yelp dataset. Changes in the users’ preferences can be clearly identified by measuring the similarity between the users’ preferences and an item’s features throughout a period of time. Similarity values with high discrepancy indicate that the users tend to change their preferences more frequently and dramatically. The similarity between the users’ long-term preference profiles and the selected item is measured by the cosine similarity in equation (10) throughout the selected period. Figure 3 shows the preference changes of two users toward the same item. It can be seen that a significant difference in preferences from 2013 to 2019 is found for both users. Moreover, the two users tend to change their preferences at different rates. More specifically, it seems that one user has more stable preferences over a certain period, whereas the preferences of the other user change more frequently and dramatically. These findings further confirm that the rate of users’ preference changes differs, and it should be addressed in the recommendation process for the individuals and the groups of users.

4.3. Evaluation Metrics

To evaluate the performance of the proposed recommendation approach, a variety of evaluation metrics, including precision, recall, novelty, and diversity, are used in the study. Using these metrics together allows the approaches to be compared from several complementary angles. All evaluation metrics are applied at a top k rank, where only the top k results in the recommendation list are considered. Precision and recall are given in equations (11) and (12), respectively:

$$\mathrm{Precision@}k = \frac{1}{|U|} \sum_{u \in U} P_u@k, \qquad P_u@k = \frac{h_u@k}{k}, \tag{11}$$

$$\mathrm{Recall@}k = \frac{1}{|U|} \sum_{u \in U} R_u@k, \qquad R_u@k = \frac{h_u@k}{|T_u|}, \tag{12}$$

where k is the length of the recommendation list; in the experimental studies, k corresponds to 5, 10, and 20. U is the set of users, $P_u@k$ is the precision at k for a given user u, $R_u@k$ is the recall at k for a given user u, $h_u@k$ is the number of the user’s target items that the user likes in the recommendation list, and $|T_u|$ is the total number of the user’s target items in the test set.
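A minimal sketch of precision and recall at k is given below. The function names and the data layout (a ranked list per user and a set of target items per user) are assumptions for illustration; returning a recall of 0.0 for a user without target items is also an assumption.

```python
def precision_recall_at_k(recommended, relevant, k):
    """recommended: ranked list of item ids; relevant: set of target items."""
    top_k = recommended[:k]
    hits = sum(1 for item in top_k if item in relevant)
    precision = hits / k
    recall = hits / len(relevant) if relevant else 0.0  # assumption for empty sets
    return precision, recall


def mean_metrics_at_k(recs_by_user, relevant_by_user, k):
    """Average the per-user precision and recall over all users."""
    scores = [
        precision_recall_at_k(recs_by_user[u], relevant_by_user[u], k)
        for u in recs_by_user
    ]
    n = len(scores)
    return (sum(p for p, _ in scores) / n, sum(r for _, r in scores) / n)


# One user: 2 of the 3 target items appear in the top 5.
p, r = precision_recall_at_k(["a", "b", "c", "d", "e"], {"a", "c", "x"}, k=5)
```

Here `p` is 2/5 and `r` is 2/3, which illustrates why the two metrics can move in different directions as k grows: the denominator of precision grows with k, whereas the denominator of recall is fixed.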

The novelty metric defines how unfamiliar and surprising recommended items are to a particular user [13] (see equation (13)). To measure the item’s novelty, its probability is defined as a function of the item’s rank for all the users. Therefore, the novelty for all the items in the recommendation list is defined as the average popularity rank:

$$\mathrm{Novelty}(R) = \frac{1}{n} \sum_{i \in R} \mathrm{rank}(i), \tag{13}$$

where R is the list of top recommendations provided for a given user, n stands for the number of provided recommendations, and $\mathrm{rank}(i)$ denotes the popularity rank of item i.
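The average popularity rank can be sketched as follows. The helper names are hypothetical, and the ranking convention (rank 1 for the most frequently interacted item, so that larger values mean less popular and therefore more novel items) is an assumption, since the text does not state it.

```python
from collections import Counter


def popularity_ranks(interactions):
    """Map each item to its popularity rank; rank 1 = most interactions (assumption)."""
    counts = Counter(interactions)
    ordered = [item for item, _ in counts.most_common()]
    return {item: rank for rank, item in enumerate(ordered, start=1)}


def novelty(recommended, ranks):
    """Average popularity rank of the recommended items."""
    if not recommended:
        return 0.0
    return sum(ranks[item] for item in recommended) / len(recommended)


# Toy interaction log: "a" is the most popular item, "c" the least.
toy_ranks = popularity_ranks(["a", "a", "a", "b", "b", "c"])
```

Under this convention, a list that contains long-tail items receives a higher novelty score than a list of best-sellers.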

The diversity metric stands for the average dissimilarity score between every possible pair of recommended items for a particular user (please refer to equation (14)). It denotes the level of difference between the recommended items:

$$\mathrm{Diversity} = \frac{2}{n(n-1)} \sum_{i<j} \left(1 - S_{ij}\right), \tag{14}$$

where $S_{ij}$ is the similarity score between every possible pair (i, j) of recommended items, and n represents the number of recommendations for a given user. The similarity score between two items is calculated by using the cosine similarity measure (as presented in equation (10)). A higher similarity score among recommended items indicates a low level of diversity.
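One common way to compute such a score is the average of one minus the cosine similarity over all item pairs. The sketch below follows that reading; the exact normalization used in the study is an assumption, and the function names are hypothetical.

```python
import math
from itertools import combinations


def cosine(a, b):
    """Cosine similarity; returns 0.0 for zero vectors (assumption)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def diversity(item_vectors):
    """Average pairwise dissimilarity (1 - cosine) of the recommended items."""
    pairs = list(combinations(item_vectors, 2))
    if not pairs:
        return 0.0  # fewer than two items: no pair to compare
    return sum(1.0 - cosine(a, b) for a, b in pairs) / len(pairs)
```

Because cosine similarity ranges from −1 to 1, this score ranges from 0 (identical items) to 2 (opposite items).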

4.4. Personal Recommendations Evaluation Design

To examine the effectiveness of our proposed approach to user preference changes, top personal recommendations are provided at two time points (T1 and T2). As long-term preferences tend to change slowly over time, the effects of the user preference changes could be more obvious over a long time interval. On that premise, the two-year time interval was chosen between T1 and T2, which represent the date of 11-11-2017 (T1) and the date of 11-11-2019 (T2), respectively, where T2 is the date on which the updated dataset was obtained. The recommendation performance is evaluated by precision, recall, novelty, and diversity. Moreover, the results of the top personal recommendations are compared with a set of baseline algorithms, including the K-nearest neighbors algorithm (K-NN) [53], K-means clustering algorithm [54], co-clustering [55], nonnegative matrix factorization (NMF) [56], and singular value decomposition (SVD) algorithm [57].

4.5. Group Recommendations Evaluation Design

To conduct the evaluation of our group recommendation approach, recommendations are provided for groups with different numbers of users, including small, medium, and large groups. Small, medium, and large groups contain 3, 6, and 12 users, respectively. For every group size, 100 groups are randomly generated from the dataset. In total, 300 groups containing 2100 users are used for group recommendation evaluation.

The proposed group recommendation approach is compared with different group recommendation approaches, including the average, the least misery, and the most pleasure approaches based on the neural collaborative filtering algorithm [46]. The neural collaborative filtering average approach (average-nnmf) considers the opinion of every user in the group equally: it averages the predicted ratings of all users in the group for a particular item and uses this average as the group prediction for that item. The neural collaborative filtering least misery approach (lm-nnmf) attempts to minimize the misery for the users in the group. The main idea is that the group is as satisfied with the predictions as the least satisfied user in the group. In this approach, the prediction for a group corresponds to the minimum of the predicted ratings of the users in the group for a particular item. Finally, the neural collaborative filtering most pleasure approach (mp-nnmf) takes into consideration the items that one user in the group likes the most but does not take into account the preferences of the other users in the group. The prediction for a group corresponds to the maximum of the predicted ratings of the users in the group for a particular item.
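The three baseline aggregation strategies differ only in how the per-user predicted ratings are combined. The sketch below illustrates this; the function name and the strategy labels are hypothetical, and the predicted ratings are assumed to come from a separately trained model.

```python
def aggregate_group_rating(predicted_ratings, strategy):
    """Combine the predicted ratings of all group members for one item."""
    if not predicted_ratings:
        raise ValueError("at least one predicted rating is required")
    if strategy == "average":
        return sum(predicted_ratings) / len(predicted_ratings)
    if strategy == "least_misery":
        return min(predicted_ratings)  # the least satisfied user decides
    if strategy == "most_pleasure":
        return max(predicted_ratings)  # the most satisfied user decides
    raise ValueError(f"unknown strategy: {strategy}")


# Three group members with predicted ratings for the same item.
ratings = [4.0, 2.0, 5.0]
```

For these ratings, the average strategy gives 11/3, least misery gives 2.0, and most pleasure gives 5.0.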

5. Results and Discussion

5.1. Results of Top Personal Recommendations

Figure 4 depicts the results of the top 5, 10, and 20 personal recommendations in terms of precision. Overall, the proposed approach has higher scores at both T1 and T2, which shows that it performs better in precision than the other algorithms. It implies that the proposed approach could identify and provide more relevant items for users. Furthermore, it can be seen that there is a difference in recommendation performance between T1 and T2. More specifically, the precision of the proposed approach at T1 reaches 0.8683, 0.8492, and 0.8331 at the top 5, 10, and 20 recommendations, respectively. At T2, the proposed approach shows a better performance, reaching 0.8723, 0.8521, and 0.8340 for the top 5, 10, and 20 recommendations, respectively. This difference may indicate that preferences in users’ profiles were constantly changing and, as a result, had different weights at T1 and T2. These changes at both time points have been captured by our proposed approach. This implies that changes in the user preferences can moderately affect the recommendation performance, especially in precision. In addition, it is interesting to see that the precision of the proposed approach gradually declines from the top 5 to the top 20 recommendations at both T1 and T2. A possible explanation is that a longer recommendation list may include more irrelevant (false-positive) items, which in turn decreases the recommendation precision.

Moreover, the results of the recommendation performance regarding the recall are shown in Figure 5. As presented in the figure, among the measured approaches, the proposed approach ranks highest at both time points, showing a better recommendation performance. More specifically, the proposed approach has a higher proportion of relevant items in the top 5, 10, and 20 recommendations. However, it is worth noting that the performance at T1 is very close to the performance at T2. A possible explanation is that the percentage of relevant (true-positive) candidate items remains stable at different time points. This may have occurred because the changes in the user preferences have small effects on recall performance. Surprisingly, the results indicate that the performance at T1 is better than the performance at T2. Such results are not consistent with the findings for precision, which showed that the recommendation at T2 has a better performance than at T1. This may be attributed to the precision-recall tradeoff, where an increase in one metric (precision or recall) can lead to a decrease in the other. Our results also show that the recall increases gradually from the top 5 to the top 20 recommendations at both T1 and T2. This may suggest that a positive correlation exists between the proportion of relevant (true-positive) items and the number of recommendations.

Regarding the recommendation performance in terms of the novelty (see Figure 6), overall, the results show that the proposed approach has a better performance than the other algorithms. However, it is worth noting that the proposed approach at T2 slightly underperforms the K-NN and co-clustering algorithms at the top 10 recommendations. This may indicate that the user preferences have stabilized at T2, which leads to a lower level of performance regarding novelty. Moreover, performance differences in novelty are found between T1 and T2. Such differences reflect the changes in user profiles and feature weights, showing that the novelty performance is greatly influenced by changes in user preferences.

Figure 7 presents the recommendation performance of the diversity at the top 5, 10, and 20 personal recommendations. As expected, the results show that the proposed approach has the best performance in terms of the diversity among all measured approaches. It implies that the items recommended by the proposed approach are more diverse and dissimilar. Furthermore, it is notable that there are significant differences in the recommendation performance between T1 and T2. Specifically, the diversity of the proposed approach achieves 1.8038, 1.7823, and 1.7710 at the top 5, 10, and 20 recommendations at T1, respectively. The performance at T2 reaches 1.7938 at the top 5, 1.7743 at the top 10, and 1.755 at the top 20 recommendations, which is lower than the performance at T1. A possible explanation is that the diversity is influenced by users with fewer preference changes at T2. Such performance differences further support our previous findings that the weights of the preferences in users’ profiles are distinct at T1 and T2. Moreover, such results show that the diversity metric can be more affected by user preference changes. In addition, it is interesting to note that the diversity decreases from the top 5 to the top 20 recommendations at both T1 and T2. Such a finding can be explained by considering that the more items are recommended, the fewer dissimilarities can be found among them.

Furthermore, the average performance of each measurement approach in terms of the precision, recall, novelty, and diversity metrics is shown in Table 5. The results show that the scores for the proposed approach are higher than other approaches in each metric, indicating that the proposed approach has a better average performance (highlighted in bold in the table). Such results confirm that better performance in terms of precision, novelty, and diversity can increase overall performance. In addition, the average performance between T1 and T2 shows the differences in terms of precision, novelty, and diversity. This implies that the recommendation performance is affected by the time differently, which in turn notes the importance of users’ preference changes. These findings suggest that capturing users’ preference changes can not only make precision improvements but also enhance the novelty and the diversity of the recommendations. At this point, our proposed approach is useful for efficiently providing recommendations to the users who either continuously change their preferences or keep stable preferences.

5.2. Results of the Top Group Recommendations

The results of the group recommendations in terms of the precision among the small, medium, and large groups are presented in Figure 8. Overall, the results show that the proposed approach outperforms other approaches within all three types of group recommendations. It shows that more relevant items are recommended by the proposed approach than other approaches within the small, medium, and large groups of users. Regarding the performance of the proposed approach, the results show that the best performance is achieved by the medium groups. The large groups place next, and the small groups perform worst. This may have occurred because the distribution of the preferences among the users of the medium groups is relatively low, whereas users in the small and the large groups have a high level of preference distribution. Such a low distribution in medium groups indicates that the users have similar preferences, and the proposed approach could better identify relevant items for groups and consequently achieve better precision performance. Moreover, it shows that the number of users in the groups has an impact on the precision performance within the three groups. An increasing number of users in a group may challenge the precision of recommendation. In addition, it is interesting to note that the precision of the top group recommendations remains stable from the top 5 to the top 20 recommendations among all three types of groups. This indicates that the increasing number of recommendations does not affect the ratio of irrelevant items, which leads to stable performance of the precision.

Figure 9 depicts the recommendations performance in terms of the recall among the three groups of users. As shown in the figure, overall, the proposed approach achieves better recommendation performance than other approaches among the three groups. It seems that the proposed approach provides the higher percentage of relevant (true-positive) items among all the groups. However, it is interesting to see that in large groups, the proposed approach slightly underperforms the mp-nnmf approach at the top 10 recommendations. This may have occurred because the proposed approach inaccurately identified irrelevant items for the large groups of users at the top 10 recommendations due to a high level of preference distribution among users in the large group. Consequently, the rate of relevant (true-positive) items decreased, leading to the lower performance of the recall. Moreover, it can be noted that in the proposed approach, the results of the recall within the three groups are fairly close to each other. Such results indicate that the users in the different groups may have small effects on the recommendation performance of the recall. In addition, the results show that the performance of the recall within each group has a significant increase from the top 5 to the top 20 recommendations. This may imply that the increasing number of recommendations may have more relevant (or true positive) items, which increases the recommendation recall.

Figure 10 shows the performance results of the novelty at the top 5, 10, and 20 recommendations among the small, medium, and large groups. As expected, it seems clear that the proposed approach outperforms other approaches among all groups. Moreover, the results reveal that the novelty in small and medium group recommendations follows the same pattern in which the performance decreases gradually from the top 5 to the top 20 recommendations. Such a decrease can possibly be explained by the fact that the more items are recommended, the more high-ranked items are included in recommendations. Nonetheless, for the large groups, the novelty decreases steadily from the top 5 to the top 10 recommendations with a sudden increase from the top 10 to the top 20 recommendations. A possible explanation of such a fluctuation lies in the distribution of preferences among the users in the groups. This supports our previous findings and suggests that large groups contain users with widely differing preferences that affect recommendation outcomes in novelty. Furthermore, the values of the results show a high discrepancy among small, medium, and large groups. This discrepancy can be caused by the size of the user groups. Thus, the number of users in a group could significantly affect the performance of the novelty.

Furthermore, the performance results of the diversity at the top 5, 10, and 20 recommendations among the three groups are shown in Figure 11. It is clear that the proposed approach shows better performance than other approaches in small, medium, and large groups. In particular, the diversity of recommendations for the small groups reaches 1.095, 1.232, and 1.382 for the top 5, 10, and 20 recommendations, respectively. In the medium groups, it achieves 0.979 for the top 5, 1.086 for the top 10, and 1.204 for the top 20 recommendations. A similar increasing trend can be seen in the large group recommendations, which hit 0.931, 1.025, and 1.112 for the top 5, 10, and 20 recommendations, respectively. Moreover, the results show that the performance of the diversity increases from the top 5 to the top 20 recommendations among all the three types of groups. This can possibly be explained by the fact that the more items are recommended, the more dissimilarities are found among recommended items. However, a negative correlation between the number of users in the group and diversity performance is found in the results. It may be explained by the distribution of users’ preferences in the different groups. As there is an increase of users in the groups, more users may share interchangeable preferences. Accordingly, the similarity among recommended items rises and consequently influences the performance of diversity.

Finally, the average performance of precision, recall, novelty, and diversity in the small, medium, and large groups is presented in Table 6. The results are consistent with our previous findings showing that the performance of the proposed approach is better than other approaches in all evaluation metrics among the different sizes of group recommendations. Moreover, the results show that there are differences in performance among small, medium, and large groups. Specifically, the performance of recall shows small differences of results among small, medium, and large groups, which implies that users in groups only have a slight impact on the recommendation performance in recall. However, the differences in the results in precision, novelty, and diversity among all the three groups are relatively high, which implies that the users have a considerable impact on the recommendation performance. These findings show that considering the users in groups could enhance the overall recommendation performance and provide not only precise but also novel and diverse group recommendations.

6. Conclusion

The recommendation systems work as a filtering service to fight against information overload, which is increasingly drawing the attention of the practitioners and the academics. Evidence from previous studies indicates that there is a need to develop an efficient recommendation approach to offer precise positioning of information targets and facilitate resource utilization not only for individuals but also for the groups of users. In particular, there is a need to address time awareness, which can be more in line with user preference changes to enable the improvement of recommendation performance. Therefore, this study aims to address user preference changes through a certain period of time to develop a hybrid recommendation approach that provides recommendations for both the individuals and the groups of users. More specifically, the following research questions are investigated: (1) What are the effects of the user preferences changes on the individual user and group recommendations? and (2) How do user groups affect the precision, recall, novelty, and diversity of group recommendations? To answer these questions, our proposed approach integrates neural collaborative filtering with a content-based method for individual recommendations, and the personal recommendation aggregation method is employed for group recommendations.

The results are summarized in the following aspects. First, the overall results show that the proposed approach achieves better performance than other algorithms, including K-NN, K-means, co-clustering, NMF, and SVD in personal recommendations and average-nnmf, lm-nnmf, and mp-nnmf in group recommendations. Such results further confirm the validity of our proposed approach for individual and group recommendations. Second, differences in the recommendation performance in terms of precision, recall, novelty, and diversity between T1 and T2 are found in the study. This may imply that users’ dynamic preference changes through time can be well captured in users’ profiles by our proposed approach. Moreover, the results indicate that the changes in user preferences over time have a great impact on recommendation performance. These findings provide evidence that addressing user preference changes over time improves user modeling and consequently increases recommendation performance. Third, the results of the group recommendation evaluation demonstrate that the performance of precision, recall, novelty, and diversity differs among the small, medium, and large groups of users. These findings reveal that the number of users in groups has a considerable impact on the efficiency and effectiveness of group recommendation performance.

This study contributes to recommendation systems research by developing a time-aware hybrid recommendation approach to offer recommendations for both the individual and the group users. Furthermore, investigating the link between the user changing preferences over time and the user profiles provides deep insights into building more effective and precise user profiles that could improve the recommendation performance. The proposed approach would be helpful for developers to increase their understanding of capturing the user dynamic preference changes in the user profiles as well as strengthening the knowledge of the user profile modeling for both the personal and the group recommendations. In addition, our study can also help scholars broaden their knowledge of recommendation evaluation by focusing on multiple metrics, including precision, recall, novelty, and diversity, thereby providing deeper insights into measuring the efficiency of the recommendation systems.

This study has some limitations. For example, the proposed approach only considers users’ positive feedback in the user modeling without addressing the negative feedback, which may limit the modeling of user profiles and the recommendation performance. Further study can address both positive and negative feedback to obtain more comprehensive user preferences and improve the user profiles. Furthermore, the proposed approach is limited by neglecting the social relationships among the users in the group, which may affect group satisfaction and the quality of group recommendations [58]. Future work can extend the study to the relations among the users in the group, including friends, colleagues, relatives, or strangers. The results will be beneficial for optimizing and improving the group recommendations.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This study was supported by research grants funded by the National Natural Science Foundation of China (grant no. 61771297).