Abstract

In the last decade, sentiment analysis, opinion mining, and the subjectivity of microblogs in social media have attracted a great deal of attention from researchers. Movie recommendation systems are tools that provide valuable services to users. The data available online are growing steadily because the online activity of users and viewers increases day by day. As a result, big data, analytics, and computational issues have arisen. Therefore, recommendation services must be improved beyond the traditional ones to make the recommendation system significant and efficient. This article addresses these issues by producing significant and efficient recommendation services using multivariates (ratings, votes, Twitter likes, and reviews) of movies from multiple external resources, which are fetched by a web bot and managed by the Apache Hadoop framework in a distributed manner. Reviews are analyzed by a deep semantic analyzer based on a recurrent neural network (RNN/LSTM attention) with user movie attention (UMA) to produce the emotion. The proposed recommender evaluates the multivariates and efficiently produces a more significant movie recommendation list, according to the taste of the user, on a mobile app.

1. Introduction

“Recommendation systems” are services that use Artificial Intelligence (AI) and Natural Language Processing (NLP) techniques to provide empirical recommendation solutions for various application frameworks and services [1]. Recommendation systems enable mobile apps and web applications to make intelligent judgments about the selection of different items: movies [2], hotels [3], food [4], tourism [5], books [6], TV shows [7], YouTube videos [8], health [9], etc. Community trends polarize towards music, movies, or videos. A huge number of music, movie, and video streams is available online, but which of them will be watched is still an open question. Music and movie recommendation systems still face challenges related to the playlist, magnitude, security, privacy, recommendation, and session. Therefore, music recommendation systems (MRSs) have become a domain of music information retrieval (MIR) [10–13]. Society has changed, and community trends now depend highly on mobile app usage. Several products are enriched by the usage of a mobile app, so mobile app recommendation systems are essential for a suitable selection of recommended items [14–16]. Most recommender systems are univariate and use ratings, reviews, or tweets [17], and a few others are bivariate (sentiment score and likes) [18–20]. This work is state of the art and uses a multivariate matrix, which makes the decision using a dynamic approach for suggesting movies according to the relative taste of the users. The term “multivariate” means involving many variables, namely a qualitative variable (semantic score) and quantitative variables (Twitter likes, ratings, and votes) of movies from three movie sites for significant recommendation [21]. Our work concerns the polarity classification of movie reviews, where an opinionated document is labeled with the semantic emotions of the microblog text or reviews [22] using a semantic parser based on the recurrent neural network (RNN/LSTM) [23, 24].
A drawback is that a change in a user’s review of a movie may affect the user’s preference. The nature of reviews is influenced by the choice of words, for which multilingual dictionaries are used. Some recommendation systems use linked movie databases, including Trovacinema, Google Places, and Netflix, and Wikipedia provides linked data and ontologies for descriptions of movies [25–27]. Solving NLP problems with shallow machine learning models relies on handcrafted features and is time-consuming. Nowadays, word embeddings and neural models have achieved success and popularity by producing better results than traditional machine learning methods such as logistic regression, SVM, and KNN.

Artificial neural networks (ANNs) are mathematical models inspired by biological neural networks. They have three simple layers (input, hidden, and output layers) or sometimes only two (input and output layers). The input layer is connected to the hidden layer via weighted connections, and the hidden layer output is computed via the activation function. In the ANN, as in the biological neural network, neurons are the nodes, while synapses are the edges. Each artificial neuron in the ANN has an activation function. There are several activation functions, such as the sigmoid, which ranges from 0 to 1; the hyperbolic tangent, which ranges from −1 to 1; the softmax function, whose output is a categorical distribution; and the ReLU function, which is commonly used in feedforward neural networks. The ANN is not an algorithm; it is a framework in which several machine learning algorithms work together to solve a complex task. Therefore, we can say that it is a collection of neurons or a network of neurons (https://en.wikipedia.org/wiki/Artificial_neural_network). The recurrent neural network (RNN/LSTM), a basic structure of deep neural networks, processes a sequence semantically. Several NLP tasks are performed by RNNs/LSTM attention. In this work, we used the hierarchical neural network (HNN) based on LSTM attention, which incorporates global user and movie information via word-level and sentence-level attention for document representation. The user’s reviews and movie features at the word and sentence levels are taken for the semantic analysis of reviews, which plays a major role in producing true recommendations. Global user information represents personal behavior, and the movie feature represents a movie genre, a movie profile, or linked data that are useful for the semantic extraction of movie reviews [28]. In natural language (a word sequence), each word or sentence is related to the others and needs to be understood semantically.
A huge amount of data is available online as web content (ratings, reviews, likes, votes, smileys, images, and stars) that can be fetched by a web bot, web agent, or crawler, terms that are used interchangeably. This web content is useful for recommendation services. It is evaluated to form a perception of users and items and to make recommendations for others [29, 30]. Pressing big data issues such as computational complexity are managed by using MapReduce and Apache Mahout in a NoSQL [31, 32] distributed environment, which reduces computational complexity by clustering and horizontal scaling instead of relying on a single powerful machine [33]. Because user frequency and data volume increase steadily, it is difficult to manage these huge data on a single machine. Sparsity can be reduced by factorization [34]. Movie recommendation systems provide services to users using content-based filtering algorithms [35], collaborative filtering [36], and combinations of these that form a hybrid filtering algorithm [37]. We used implicit ratings, which are managed by the server, to handle the cold start problem [38]. The multivariate movie recommender lets users watch movies according to their profile or history (previously watched or rated). Therefore, there is a need to improve recommendation systems for significant recommendation services. We developed a pilot version for these problems, which consists of a mobile app, a web scraper, and a multivariate recommender to provide significant movie recommendation services in an efficient way.
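The activation functions named above can be written in a few lines. The following is a minimal sketch using only the Python standard library; the function names are illustrative choices and are not taken from the system described in this paper.

```python
import math

def sigmoid(x):
    """Logistic sigmoid: output lies strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))

def tanh(x):
    """Hyperbolic tangent: output lies strictly between -1 and 1."""
    return math.tanh(x)

def relu(x):
    """Rectified linear unit: negative inputs are clipped to 0."""
    return max(0.0, x)

def softmax(xs):
    """Turn a list of scores into a categorical distribution (sums to 1)."""
    m = max(xs)                           # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]
```

Sigmoid and tanh squash a single value, whereas softmax acts on a whole vector of scores, which is why it is typically placed at the output layer of a classifier.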

This work is arranged as follows: related works are discussed in Section 2, the recurrent multivariate movie recommendation system model is explained in Section 3, recurrent multivariate movie recommendation system implementation is given in Section 4, experiments and results are discussed in Section 5, and evaluation of the system is done in Section 6. The conclusion of this paper is presented in Section 7 and future work with more parameters in Section 8.

2. Literature Review

Sentiment analysis deals with users’ comments, reviews, likes, ratings, etc. to retrieve the sentiment and opinions of users. Microblog text sentiment analysis is based on NLP methodology and has been used to retrieve suitable YouTube videos and movies and to study campaigns for smoking cessation, pharmacovigilance, election politics, pizza advertisement, journalistic inquiry, and influenza prevention for public health [39–45]. The CNN and RNN are two major categories of deep neural networks (DNNs). The RNN deals with sequential structures and the CNN with hierarchical structures. Both the CNN and RNN can be supervised, semisupervised, or unsupervised. The deep learning algorithm also involves propagation and weight update activities. RNNs are based on multiple layers (input, hidden, and output layers), while CNNs have input, hidden, and pooling layers. The CNN is efficient for pattern recognition in hierarchical data classification. The RNN, however, deals with sequential data to be semantically analyzed and classified in NLP; in the CNN, the window size is limited, so the RNN is very useful if reviews from the microblog are very long [46, 47]. Recommendation frameworks were presented as agents of the second class, being characterized as frameworks that “… enable individuals to settle on decisions dependent on the conclusions of other individuals.” [48]. Early information-sharing frameworks belonged to the first class and depended on text-based categorization or filtering, which works by selecting relevant items according to a set of textual keywords [49]. Recommender frameworks propose “things important to clients dependent on their unequivocal and verifiable inclinations, the inclinations of different clients, and client and thing traits.” [50]. A recommendation system finds the right product according to the taste of the customer by filtering on the like value [51].
Recommendations use the opinions of a community of users to help individuals in that community identify content of interest more effectively from a potentially overwhelming set of choices [52]. Demographic recommendation groups users according to the attributes of their personal profile and creates recommendations based on demographic classes. An early example is the stereotype-based Grundy system, which was built to support book searching in a library [53]. Utility-based recommendation relies on computing the utility of each item for a user from a utility function (http://www.eqo.info). Knowledge-based recommendation proposes items based on logical inferences about a user’s preferences; a knowledge representation or a rule about how an item meets a specific user requirement is needed (http://www.findme.com.ph). By applying preference-based collaborative filtering, a recommender system intends to predict the majority estimate of liking, even though a few users may hold divergent views [54]. There are two types of architecture for recommendation systems: one is centralized and situated at a specific location [55]; the other is geographically distributed and situated at different locations [56]. There are three recommendation modes by which the system can be initiated. The first is the push mode, in which suggestions are pushed to the user, for example by email, while he is not interacting with the system [48]. The second is the pull mode, in which suggestions are generated but displayed to the user only when he permits it or explicitly asks for it [57]. Push and pull modes are the active modes in which the recommender is initiated. The third is the passive mode, in which suggestions are generated as part of the regular system activity, for instance, an item suggestion with reference to a user’s preference [58].
A user’s preference for items can be determined by using the linear adaptive function of multiattribute utility theory (MAUT) [59]. Cosine similarity, determined by comparing vectors, is one of the well-known similarity measures, since it considers only the angle between two vectors and not their magnitude. The similarity between the search item and another item rated by users can be measured by the angle between their vectors: if the angle is 90°, the cosine similarity is zero, which means the item is irrelevant; if the angle between the vectors is close to zero, the cosine similarity is close to one, which means the item is relevant (https://en.wikipedia.org/wiki/Cosine_similarity) [60]. There are three major classes of filtering: (1) collaborative filtering (CF), in which the profile data of users and items are required to make a recommendation decision [61]; (2) content-based filtering, which relies on the description of the content of items and on user preference information (explicit or implicit) for recommendation [62]; and (3) hybrid filtering, which combines various filtering techniques to handle scalability, sparsity, the cold start problem, and other big data issues of the recommendation system to get better outcomes [63].

3. Multivariate Movie Recommendation Model

The multivariate approach (see Figure 1) is based on three modules: a mobile app, a multivariate recommender, and a web scraper. Users get the recommendation services through a mobile application. The mobile app module provides information such as the user’s query, profile, and history to the recommender module. Recommendations are made for both registered and unregistered users of the mobile app. The recommendation module is based on a deep learning NLP module and a computation module. The NLP module preprocesses the fetched qualitative data (users’ reviews) of microblogs using a tokenizer, stemmer, and POS tagger and then semantically analyzes the reviews and extracts the semantic emotions about movies. The semantic parser is based on the deep machine learning algorithm recurrent neural network (RNN/LSTM attention) with user movie attention (UMA). Semantic emotion is classified into five major classes: (i) Highly Favorable, (ii) Favorable, (iii) Averagely Favorable, (iv) Unfavorable, and (v) Highly Unfavorable, on the basis of their relative semantic scores. The computation module normalizes the quantitative data (Twitter likes, votes, and ratings); the normalized scores and semantic emotional scores are then evaluated to generate the recommended movie list. The recommended movie list uses five medals with their popularity: Platinum: “Highly Popular,” Gold: “Popular,” Silver: “Averagely Popular,” Bronze: “Unpopular,” and Copper: “Highly Unpopular.” The recommended movie list is generated according to users’ taste and preference. A web scraper fetches data (reviews, Twitter likes, votes, and ratings) from external data source sites (CinemaBlend, Moviefone, Rotten Tomatoes, and Twitter) and stores them in the NoSQL database for computation. Users’ feedback about a movie and the app is useful for generating the recommended list and for evaluating system reliability.

3.1. NLP Module

NLP has the capability to understand natural language. Users share their opinions and reviews on microblogs, which helps in making a decision. Positivity, negativity, and neutrality are extracted by opinion mining, whereas emotions are extracted by semantic analysis. In our work, the NLP module determines the semantic emotion of a movie’s reviews with the LSTM-attention machine learning algorithm. This semantic emotion is one of the parameters in the multivariates used to make a recommendation. The methodology is as follows:
(i) The module fetches the reviews related to movies from microblogs such as CinemaBlend, Moviefone, and Rotten Tomatoes
(ii) The module preprocesses the microblog text or reviews using a sentence splitter, tokenizer, and stemmer/lemmatizer
(iii) The module determines the sense of each word to strengthen the sentiment using SenticNet
(iv) Attention-based semantic parsing is done to construct a parse tree that identifies the syntactic structure used to derive the emotion of the sentence
(v) The RNN/LSTM-user movie attention (UMA) machine learning algorithm is used to classify the reviews

3.2. Preprocessing

It is estimated that more than 80% of data are unstructured and not organized. Preprocessing is the cleaning or normalization of text/reviews. Stemming or lemmatization and tokenization are done to reduce sparsity and shrink the feature space. Semantic analysis has to face challenges such as short text, misspellings, grammatical mistakes, slang, unusual terms, tags, white spaces, noise, and emoji. Text is a sequence of words, while a word is a meaningful sequence of characters. The question, however, is how to find the boundaries of words. In English, words are delimited by spaces or punctuation. In German, however, a compound word is a set of words written without spaces, for example, (“childhood memories description of an unforgettable event”) ⟶ (“Kindheitserinnerungen Beschreibung eines unvergesslichen Ereignisses”), while Japanese uses no spaces at all, which would look like this (“childhoodmemoriesdescriptionofanunforgettableevent”).

3.2.1. Tokenization

The process of splitting the text stream into units is called tokenization, and the units are called tokens. For example, “This movie is so riddled” is a character string that is tokenized as [This] [movie] [is] [so] [riddled]. Splitting the input sequence into tokens has some problems. Splitting by white space has the problem that words with the same meaning can come out as different tokens, for example, when punctuation stays attached to a word (https://NLTK.Tokenize.WhiteSpaceTokenizer). Splitting by punctuation produces some tokens that are not meaningful, as in the “apostrophe problem” (https://NLTK.Tokenize.WordPunctTokenizer). Splitting with a set of rules generates a more meaningful result (https://NLTK.Tokenize.TreeBankWordTokenizer).
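The three splitting strategies can be compared on one sentence. The sketch below uses only the standard `re` module rather than the NLTK tokenizers named above, so it is an approximation of their behavior under simplified rules; the example sentence and the single contraction rule are assumptions made for illustration.

```python
import re

text = "This movie isn't so riddled, is it?"

# 1) Whitespace splitting: punctuation stays glued to the neighboring word.
whitespace_tokens = text.split()

# 2) Punctuation splitting: the apostrophe breaks "isn't" into three pieces.
punct_tokens = re.findall(r"\w+|[^\w\s]+", text)

# 3) Rule-based splitting (Treebank-like, heavily simplified):
#    the contraction "n't" is kept together as its own token.
def rule_tokens(s):
    s = re.sub(r"(\w)(n't)\b", r"\1 \2", s)      # "isn't" -> "is n't"
    return re.findall(r"n't|\w+|[^\w\s]", s)
```

Here `whitespace_tokens` contains `"riddled,"` with the comma attached, `punct_tokens` contains the fragments `"isn"`, `"'"`, `"t"`, and `rule_tokens(text)` yields `"is"` followed by `"n't"`, which keeps the negation visible to later sentiment steps.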

3.2.2. Stemming (Lemmatization)

A stemmer reduces words to their stems. The Porter stemmer, for example, stems the English word “looked” to “look” using morphological production rules such as [(“SSES ⟶ SS”): (“Caresses ⟶ Caress”)], [(“IES ⟶ I”): (“Ponies ⟶ Poni”)], [(“SS ⟶ SS”): (“Caress ⟶ Caress”)], and [(“S ⟶ (removed)”): (“Cats ⟶ Cat”)]. However, stemming can produce nonwords, and irregular plural forms are not reduced to their singular, as in (Wolves ⟶ wolv) and (Feet ⟶ Feet). The WordNet database is looked up for lemmas to solve this type of problem. Lemmatization solves specific problems of this kind, such as (Wolves ⟶ wolf) and (Feet ⟶ Foot), but not all (https://NLTK.Stem.WordNetlemmatizer).
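The plural rules quoted above correspond to step 1a of the Porter algorithm. The following sketch implements only that step, plus a tiny dictionary lookup that stands in for the WordNet lemma lookup; the `IRREGULAR` table is a hypothetical stand-in and not the WordNet database.

```python
def porter_step_1a(word):
    """Apply the plural rules of Porter step 1a; the first matching suffix wins."""
    w = word.lower()
    if w.endswith("sses"):
        return w[:-2]          # SSES -> SS   (caresses -> caress)
    if w.endswith("ies"):
        return w[:-2]          # IES  -> I    (ponies -> poni)
    if w.endswith("ss"):
        return w               # SS   -> SS   (caress -> caress)
    if w.endswith("s"):
        return w[:-1]          # S    -> ""   (cats -> cat)
    return w

# Hypothetical two-entry lookup standing in for a WordNet-based lemmatizer.
IRREGULAR = {"wolves": "wolf", "feet": "foot"}

def lemma(word):
    """Prefer a dictionary lemma; fall back to the suffix rules."""
    w = word.lower()
    return IRREGULAR.get(w, porter_step_1a(w))
```

Because only step 1a is applied, `porter_step_1a("wolves")` returns `"wolve"`; the full Porter stemmer removes the final “e” in a later step, which gives the nonword “wolv” mentioned in the text.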

3.2.3. POS-Tag Generation

POS tags are determined for all the tokens by the Treebank POS tagger. The Penn Treebank Project defines 36 POS tags (http://www.ling.upenn.edu/courses/Fall_2003/ling001/penn_treebank_pos.html). For example, the POS tags of the string “Unwatchable I made it through 20 minutes I think” are [Unwatchable/VB] [I/PRP] [made/VBD] [it/PRP] [through/IN] [20/CD] [minutes/NNS] [I/PRP] [think/VBP].

3.2.4. Word Sense Disambiguation (WSD)

WSD is the problem of deciding the “sense” of a word. A lexicon lists a word and its possible senses. Bar-Hillel (1960) presented the example [“Little John was looking for his toy box. Finally, he found it. The box was in the pen. John was very happy.”]. In this string, the word “pen” has different senses according to WordNet. “Pen” can denote an instrument in which “ink flows from a point to write”; here, it denotes an “arena of cattle,” and WordNet also lists a sense relating to a “bird’s family.” In the assessment of movie reviews, SenticNet is used to indicate their degrees of positivity, negativity, and neutrality. The SenticNet score of each term and its frequency are determined to get the overall opinion of the reviews (https://sentic.net) [64].
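Combining term scores with term frequencies can be sketched as a frequency-weighted average. The lexicon below is a hypothetical toy table with made-up polarity values in [−1, 1]; it does not reproduce actual SenticNet entries, and the averaging rule is an assumption, since the exact aggregation is not spelled out in the text.

```python
from collections import Counter

# Hypothetical polarity values; NOT taken from SenticNet.
TOY_LEXICON = {"good": 0.8, "great": 0.9, "boring": -0.7, "unwatchable": -0.9}

def review_polarity(tokens, lexicon=TOY_LEXICON):
    """Frequency-weighted mean polarity of the tokens found in the lexicon."""
    counts = Counter(t.lower() for t in tokens)
    scored = [(lexicon[t], n) for t, n in counts.items() if t in lexicon]
    total = sum(n for _, n in scored)
    if total == 0:
        return 0.0                      # no known term: treat as neutral
    return sum(p * n for p, n in scored) / total
```

A review with the tokens “good”, “good”, “boring” scores (2 × 0.8 − 0.7) / 3 = 0.3 under these toy values, so repeated terms pull the overall opinion toward their polarity.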

3.3. Parsing

In NLP, parsing is the process of determining the structure of a sentence by analyzing its constituent words based on an underlying grammar. The Stanford parser is used to construct the parse tree that determines the syntactic structure relative to the grammar of the language. Parsing can refer to various things. Shallow parsing, or chunking, is the process of grouping words into noun phrases (NP). Words can also be grouped into verb phrases (VP) and prepositional phrases (PP) using grammar rules like (S ⟶ NP│VP), (NP ⟶ DetNoun), (NP ⟶ ProperNoun), and (VP ⟶ Verb│NP). In contrast, dependency parsing determines the dependencies between the words and their types. For example, spaCy + displaCy is used for parsing and rendering to produce a more semantic result.

3.3.1. RNN/LSTM

Neural networks are represented by RNN/LSTM cells [65]. Viewed from a bird’s-eye perspective, an RNN/LSTM is a chain of several copies of the same static network, as shown in Figure 2. Each copy in the sequence works on the input of a single timestep. In addition, the copies are linked with each other via their hidden states h. So we can say that, once the network is unfolded or unrolled, every copy has its own input. Let the sequence be represented as x1, x2, x3, …, xn and the input at each timestep as xt ∈ {x1, …, xn}. At timestep t, ht is the hidden state, and a function f is used to calculate it: ht = f(ht−1, xt). A word is represented by a timestep in the long sequence. For example, the given string is represented as a sequence in the mathematical form: “it is a good movie” ⟶ [“it,” “is,” “a,” “good,” “movie”]. With timesteps t = 0, 1, 2, …, the string “it” is represented as x0, “is” as x1, “a” as x2, “good” as x3, and “movie” as x4. If t = 1, then xt = “is” ⟶ “current timestep event” and xt−1 = “it” ⟶ “previous timestep event.”
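The recurrence ht = f(ht−1, xt) can be unrolled in a short loop. This is a minimal scalar sketch, assuming f is a tanh of a weighted sum; the weights `w_h`, `w_x` and the one-number “embedding” of each word are hypothetical values chosen only to make the example runnable.

```python
import math

def rnn_step(h_prev, x_t, w_h=0.5, w_x=1.0):
    """One timestep: h_t = f(h_{t-1}, x_t), with f = tanh of a weighted sum."""
    return math.tanh(w_h * h_prev + w_x * x_t)

def run_rnn(inputs, h0=0.0):
    """Unroll the same cell over the whole sequence and collect every h_t."""
    states = []
    h = h0
    for x_t in inputs:
        h = rnn_step(h, x_t)    # the same weights are reused at each timestep
        states.append(h)
    return states

tokens = ["it", "is", "a", "good", "movie"]
# Hypothetical scalar stand-ins for word embeddings.
toy_embedding = {"it": 0.1, "is": 0.0, "a": 0.0, "good": 0.9, "movie": 0.2}
states = run_rnn([toy_embedding[t] for t in tokens])
```

`states[1]` depends on `states[0]`, which is how information from “it” reaches the timestep of “is”; an LSTM replaces `rnn_step` with a gated cell but keeps the same unrolled loop.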

At the forget gate, the decision on which information should be remembered or discarded is made by the sigmoid function. It produces a value between 0 and 1: a value near 0 means forget, while a value near 1 means remember in the cell state. The sigmoid function at the input gate decides which values should be updated, and the new candidate values are produced by the tanh function. The sigmoid function at the output gate decides which part of the information should be output, and then the tanh function produces a value between −1 and 1.

The sequential semantic information is preserved in the hidden states of the recurrent neural network. The hidden state (ht) preserves the semantic information of the input sequence. When a new input arrives and the hidden state is passed on to the next timestep, this semantic information is altered. Passing information from one copy of the network to the next helps to find the correlation among the words of the sequence, which is referred to as a long-term dependency.
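For reference, the gate description above matches the commonly used LSTM formulation, written out below. The symbols (weight matrices W, biases b, the concatenation [h_{t-1}, x_t]) follow the textbook convention and are an assumption here, since the original equations of this section are not legible in this text.

```latex
\begin{aligned}
f_t &= \sigma\!\left(W_f\,[h_{t-1}, x_t] + b_f\right) && \text{forget gate} \\
i_t &= \sigma\!\left(W_i\,[h_{t-1}, x_t] + b_i\right) && \text{input gate} \\
\tilde{c}_t &= \tanh\!\left(W_c\,[h_{t-1}, x_t] + b_c\right) && \text{candidate values} \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t && \text{cell state update} \\
o_t &= \sigma\!\left(W_o\,[h_{t-1}, x_t] + b_o\right) && \text{output gate} \\
h_t &= o_t \odot \tanh(c_t) && \text{hidden state}
\end{aligned}
```

Here σ is the sigmoid function and ⊙ denotes elementwise multiplication.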

3.3.2. LSTM-Based Sequence Labeling

Predicates in a given input sequence are marked, and the argument labels corresponding to every predicate are identified. For example, in the sentence “I watched the movie,” the predicate (watched) is marked, and the labels corresponding to the predicate for “I,” “the,” and “movie” are agent, null, and theme, respectively. A sentence may contain multiple predicates, and the same word may receive a different label for each predicate. The vector of every word is generated by concatenating embeddings, including pretrained ones (Word2vec). A 1-bit flag marks the predicate in the specific training instance to ensure that the network deals with every predicate separately, and the result is fed into the LSTM layer to capture the word context. To label a word with respect to the predicate, the dot product with its hidden state is taken, and a softmax function is applied over it. The probability of a sentence is calculated as follows:

Here, the role label r is calculated from the weight matrix parameter using the ReLU function, and the predicate lemma and the role are represented by embedding vectors whose dot product is taken.

3.3.3. Neural Sentiment Classification (NSC)

Document-level sentiment classification is performed by neural sentiment classification (NSC) based on hierarchical LSTM attention with user movie attention (UMA) (see Figure 3), which is driven by the user’s global information and movie features [28]. Consider a user who writes a review of a movie, where the corpus consists of users and their movie set. The review is a document made up of sentences, and each sentence has its own length and consists of that many words. The semantic rating of a document is predicted from its text information. Firstly, at the word level, each word in a sentence is mapped to its embedding in a low-dimensional semantic space. At every step, given an input word, the current cell state and the hidden state are updated from the preceding cell state and hidden state. The document representation architecture is presented as follows:

The equations use the sigmoid activation function, the gate activation functions, and elementwise multiplication, together with the parameters to be trained. The hidden states are fed to an average pooling layer to obtain the representation of the sentence. At the sentence level, the sentence representations are fed into an LSTM in the same way, and the hidden states of this LSTM are fed to an average pooling layer to obtain the document representation, again using its own set of training parameters.

3.3.4. User Movie Attention (UMA)

User movie attention (UMA) is used to extract the components that are essential for sentiment classification at different levels. UMA is applied at the word level to construct a sentence representation and at the sentence level to generate a document representation. Clearly, not all words contribute equally to the meaning of a sentence for different users and movies. Instead of feeding the word-level hidden states to an average pooling layer, user movie attention (UMA) is used to extract the user/movie-relevant words that are essential to the sentence meaning. The informative words are aggregated to produce the representation of the sentence. Formally, the weighted hidden states generate the enhanced sentence representation as follows:

The attention weight measures the importance of each word for the current user and movie. Each user and each movie is embedded as a continuous, real-valued vector, and the user and movie embeddings each have their own dimension. Moreover, for every hidden state, the attention weight is presented as follows:
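The difference between average pooling and attention-weighted pooling can be illustrated with a few lines of plain Python. This is a sketch under simplifying assumptions: the attention scores are passed in directly, whereas in the model they would be computed from the hidden state together with the user and movie vectors; all names are illustrative.

```python
import math

def softmax(scores):
    """Normalize raw attention scores into weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def attention_pool(hidden_states, scores):
    """Weighted sum of hidden-state vectors; the weights are softmax(scores)."""
    weights = softmax(scores)
    dim = len(hidden_states[0])
    pooled = [sum(w * h[k] for w, h in zip(weights, hidden_states))
              for k in range(dim)]
    return pooled, weights

def average_pool(hidden_states):
    """Plain mean over the hidden states: every word counts equally."""
    n = len(hidden_states)
    dim = len(hidden_states[0])
    return [sum(h[k] for h in hidden_states) / n for k in range(dim)]
```

When all scores are equal, `attention_pool` reduces to `average_pool`; when one word receives a much higher score, the pooled vector is dominated by that word’s hidden state, which is the intended effect of attention.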

For the sentence level,

The importance of words for the sentence representation, as well as of sentences for the document representation, is given by the score function as follows:

For the sentence level, the score function uses a weight vector (applied as its transpose) and three weight matrices. Different sentences provide different clues to the meaning of a document for different users and movies. Therefore, at the sentence level, attention with the user and movie vectors is used in the same way as at the word level to select informative sentences to generate the document representation as follows:

At the sentence level, the weight of each hidden state is measured in the same way as in word attention. The higher-level representation of the document is generated by hierarchical extraction from the words and sentences in the document and is used as the feature vector for sentiment classification of the document. A tanh activation function is used at a nonlinear layer to project the current document representation into the target space of classes:

A softmax function is then applied at the final layer to obtain the sentiment distribution of the document:

The equations use the number of sentiment classes and the predicted probability of each sentiment class. During training, the loss function to be optimized is the cross-entropy error between the gold sentiment distribution and the sentiment distribution of our model, as follows:

Here, the loss sums over the training documents and uses the gold probability of each sentiment class, which is one for the ground-truth class and zero for the others.
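The attention, classification, and loss steps described in this subsection can be written compactly as follows. This is a reconstruction that assumes the model follows the user/product attention formulation of the cited work [28]; every symbol below (h for hidden states, u and p for user and movie vectors, v, W_H, W_U, W_P, b, W_c, b_c for parameters, C for the number of classes, D for the training set) is an assumed name, because the original notation is not legible in this text.

```latex
\begin{aligned}
e(h^{i}_{j}, u, p) &= v^{\top} \tanh\!\left(W_H h^{i}_{j} + W_U u + W_P p + b\right)
  && \text{score function} \\
\alpha^{i}_{j} &= \frac{\exp\!\big(e(h^{i}_{j}, u, p)\big)}{\sum_{k=1}^{l_i} \exp\!\big(e(h^{i}_{k}, u, p)\big)}
  && \text{word-level attention weight} \\
s_i &= \sum_{j=1}^{l_i} \alpha^{i}_{j}\, h^{i}_{j}
  && \text{sentence representation} \\
d &= \sum_{i=1}^{n} \beta_i\, h_i
  && \text{document representation (sentence-level weights } \beta_i\text{)} \\
\hat{d} &= \tanh\!\left(W_c\, d + b_c\right)
  && \text{projection to the class space} \\
p_c &= \frac{\exp(\hat{d}_c)}{\sum_{k=1}^{C} \exp(\hat{d}_k)}
  && \text{softmax over sentiment classes} \\
L &= -\sum_{d \in D} \sum_{c=1}^{C} p^{g}_{c}(d)\, \log p_c(d)
  && \text{cross-entropy loss}
\end{aligned}
```

The sentence-level weights β_i are obtained from the same score function applied to the sentence-level hidden states h_i, with their own parameters.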

The nomenclature used in our mathematical model is presented in Table 1.

Table 2 presents the emotion class.

3.4. Computation and Classification

Sentiment analysis determines the emotions of reviews. Firstly, the aggregated sentiment score of each document from each site for the j-th movie is computed. Then, the qualitative score and the aggregated quality scores for Twitter likes are computed to obtain the final score for the recommendation of movies and to generate the popularity class relative to the final score:

This mathematical formulation is used to determine the final popularity score using the multivariate model. The emotional value is stretched to a 10-point scale, from which the popularity status is determined. Every movie is labeled with a medal according to its popularity score. The algorithm that identifies the medal, using fuzzy logic based on the popularity score, is as follows:
(1) IF multivariate value ≥08, THEN: Platinum: “Highly Popular”
(2) ELSE IF multivariate value ≥06, THEN: Gold: “Popular”
(3) ELSE IF multivariate value ≥04, THEN: Silver: “Averagely Popular”
(4) ELSE IF multivariate value ≥02, THEN: Bronze: “Unpopular”
(5) ELSE Copper: “Highly Unpopular”
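The medal rules translate directly into an ordered threshold lookup. The sketch below assumes that the popularity score lies on the 10-point scale mentioned above, so that the thresholds are read as 8, 6, 4, and 2; if Table 3 defines the ranges differently, only the numbers in `MEDALS` need to change.

```python
# Thresholds are an assumption (10-point scale); checked from highest to lowest.
MEDALS = [
    (8.0, "Platinum", "Highly Popular"),
    (6.0, "Gold", "Popular"),
    (4.0, "Silver", "Averagely Popular"),
    (2.0, "Bronze", "Unpopular"),
]

def medal_for(score):
    """Return (medal, popularity label) for a multivariate popularity score."""
    for threshold, medal, label in MEDALS:
        if score >= threshold:
            return medal, label
    return "Copper", "Highly Unpopular"   # ELSE branch of the rule list
```

Keeping the thresholds in a table rather than in nested `if` statements makes the ranges easy to compare against the ranges listed for each medal.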

The ranges of the popularity scores and their respective medals and degree of popularity are given in Table 3.

The category is represented by a movie genre, which classifies the movie according to its features. The movie recommendation service suggests the top 10 popular movies with their categories according to the user’s request and profile history.

4. Multivariate Movie Recommendation System Implementation

4.1. System Component Interaction

The user’s Android application is the front end of the system (see Figure 4), through which users get the web services of the system. The back end is the movie recommendation system in the NoSQL environment with Apache Mahout and Hadoop, which provides the web services to the users, together with a web scraper by which the system fetches the data. The web scraper fetches the data from the external data sources on the basis of matching the lexicons of the query and the movie content.

4.2. NoSQL Environment Implementation
4.2.1. Hadoop Architecture

Hadoop is a framework with four fundamental components: (1) HDFS splits a file into many small blocks and stores them as replicas on three servers for fault tolerance, in the manner of a distributed file system. (2) MapReduce is a programming model for handling and manipulating big data. (3) Common/Core holds the reference libraries and services that support Hadoop. (4) YARN performs the management, computation, and scheduling of resources and tasks.
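The MapReduce model can be illustrated in-process with three small functions. This is a conceptual sketch in plain Python, not Hadoop code: it mimics the map, shuffle, and reduce phases on a hypothetical list of (movie, rating) records, whereas a real job would distribute these phases across the cluster.

```python
from collections import defaultdict

def map_phase(records):
    """Map: emit one (key, value) pair per input record."""
    for movie, rating in records:
        yield movie, rating

def shuffle(pairs):
    """Shuffle: group all values that share the same key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: collapse each group to a single value (here, the mean rating)."""
    return {key: sum(values) / len(values) for key, values in groups.items()}

# Hypothetical records for illustration only.
records = [("Movie A", 8.0), ("Movie B", 6.0), ("Movie A", 6.0)]
averages = reduce_phase(shuffle(map_phase(records)))
```

Because each key is reduced independently, the reduce work can be spread over many machines, which is the horizontal scaling the framework relies on.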

4.2.2. Apache Mahout

Collaborative filtering, clustering, and classification are implemented with Apache Mahout. In the NoSQL environment, the Apache Mahout interfaces are implemented on the Hadoop framework and evaluate the performance, similarity, and neighborhood measures. A multivariate web scraper is implemented, and big data are generated.

4.3. Web Scraper

Our web scraper is a script that surfs the Web, fetches data from different movie websites to extract the reviews, votes, ratings, and Twitter likes, and stores them in the repository. In addition, it manages and handles the scraped data in a NoSQL environment using Hadoop and Apache Mahout. The web scraper (web bot) receives the URLs and matches the keywords against the meta tags of each web page. If the keywords match, the web page is downloaded; otherwise, the irrelevant page is discarded.
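The keyword check against meta tags can be sketched with the standard-library HTML parser. The class and function names below are hypothetical, and the sketch assumes that relevance is decided from the `<meta name="keywords">` tag alone; fetching the page over the network is left out.

```python
from html.parser import HTMLParser

class MetaKeywordParser(HTMLParser):
    """Collect the comma-separated entries of <meta name="keywords" content="...">."""

    def __init__(self):
        super().__init__()
        self.keywords = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "keywords":
            content = attr.get("content") or ""
            self.keywords.update(
                k.strip().lower() for k in content.split(",") if k.strip()
            )

def is_relevant(html, wanted):
    """Keep a page only if at least one wanted keyword appears in its meta tags."""
    parser = MetaKeywordParser()
    parser.feed(html)
    return bool(parser.keywords & {w.lower() for w in wanted})
```

A crawler loop would call `is_relevant` on each downloaded page header and discard the page when it returns `False`.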

4.4. NLP Tools

Stanford CoreNLP tools are used to process natural languages such as English. They provide the words, their parts of speech, and the identification of sentiments. The Stanford CoreNLP framework integrates many of Stanford’s NLP tools, such as the POS tagger, parser, sentiment analyzer, named entity recognizer, and pattern learning and information extraction tools, available from https://stanfordnlp.github.io/CoreNLP.

4.5. Mobile Application Usage
4.5.1. Unregistered Users

Unregistered users can request a movie by sending a search query to our recommendation system, and the system responds to that query by content filtering, extracting the features or content of a movie from the query. The user request is then matched against the system-generated movies. The system provides a watch window in which unregistered users can watch the recommended movies. Unregistered users give feedback through their likes, and the system uses this feedback for accuracy measurement.

4.5.2. Registered Users

Once unregistered users sign up, they can sign in and maintain their profile and history. For registered users, collaborative filtering may be based on both the query and the history. Registered users may also give feedback by liking movies. The system uses this feedback for collaborative filtering between the movies in the system and the user's liked movies or movie history, both to recommend movies that match the user's preferences and to measure the accuracy of the multivariate movie recommendation system. The application also provides a watch window in which registered users can watch the recommended movies.

4.6. Cold Start Problem Handling

Collaborative filtering (CF) is used to recommend movies to both registered and unregistered users, but a cold start problem may occur in two cases:
(i) Case 1: when a registered user requests a movie, the system matches the requested movie against the movies in the system and recommends movies on the basis of the user's history. A newly registered user has no history, so the system instead recommends the movies most liked by other users, which solves the cold start problem of newly registered users.
(ii) Case 2: when a new movie arrives for a request from a registered or unregistered user, the system recommends movies by collaborative filtering over the new movie trailers that were most liked, which solves the cold start problem of newly released movies.
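The two fallbacks can be sketched as below. This is a simplified illustration rather than the system's actual logic: the function names, the data structures (a like history, a precomputed similarity table, a popularity ranking, and trailer like counts), and the padding rule are all assumptions.

```python
def recommend(user_history, similar_to, most_liked, top_n=3):
    """Recommend movies, falling back to popularity when there is no history.

    user_history: movie ids the user liked (empty for a newly registered user)
    similar_to:   dict mapping a movie id to a ranked list of similar movie ids
    most_liked:   movie ids ranked by how many users liked them
    """
    if not user_history:
        # Case 1 (new user): no history, so use the movies most liked by others.
        return most_liked[:top_n]

    seen = set(user_history)
    picks = []
    for movie in user_history:
        for candidate in similar_to.get(movie, []):
            if candidate not in seen:
                seen.add(candidate)
                picks.append(candidate)

    # Pad with popular titles if the history yields too few candidates.
    for movie in most_liked:
        if len(picks) >= top_n:
            break
        if movie not in seen:
            seen.add(movie)
            picks.append(movie)
    return picks[:top_n]


def rank_new_movies(trailer_likes):
    """Case 2 (new movie): rank newly released movies by trailer likes."""
    return sorted(trailer_likes, key=trailer_likes.get, reverse=True)


# Hypothetical data for illustration.
most_liked = ["m9", "m7", "m3", "m1"]
similar_to = {"m1": ["m2", "m3"]}
new_user_list = recommend([], similar_to, most_liked, top_n=2)
returning_user_list = recommend(["m1"], similar_to, most_liked, top_n=3)
new_movie_ranking = rank_new_movies({"n1": 10, "n2": 50})
```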

4.7. Similarity Measurement

We use cosine similarity, which measures the angle between two vectors to determine their similarity. A smaller angle corresponds to a higher similarity, and vice versa, as shown in Figure 5. It is also known as vector-based similarity. The correlation between the movie document and the search query document is computed, where q is the search query document and d is the movie document. The similarity can be calculated by the following equation:

$$\text{sim}(q, d) = \cos\theta = \frac{q \cdot d}{\lVert q \rVert \, \lVert d \rVert}$$
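As a minimal sketch of the cosine measure described above, the following code compares a query with two movie descriptions. It assumes that documents are represented as raw term-count vectors; the paper does not state the term weighting, and the sample texts are hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(q, d):
    """Cosine of the angle between two sparse term-count vectors (dicts)."""
    dot = sum(weight * d.get(term, 0) for term, weight in q.items())
    norm_q = math.sqrt(sum(w * w for w in q.values()))
    norm_d = math.sqrt(sum(w * w for w in d.values()))
    if norm_q == 0 or norm_d == 0:
        return 0.0  # an empty document is treated as dissimilar to everything
    return dot / (norm_q * norm_d)


# Hypothetical query and movie documents.
query = Counter("space adventure".split())
movie_a = Counter("epic space adventure in deep space".split())
movie_b = Counter("romantic comedy".split())

score_a = cosine_similarity(query, movie_a)
score_b = cosine_similarity(query, movie_b)
```

Here `movie_a` shares terms with the query and therefore scores higher than `movie_b`, which shares none.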

5. Experiments and Results

After data preprocessing, NLP procedures are applied to the fetched data for sentiment analysis, and the sentiment score is then computed using SenticNet. The obtained results are presented as follows.

Table 4 presents the identification of movies and categories.

Table 5 presents the identification of sites, movies, users, and reviews.

Table 6 presents the movie review from the movie website CinemaBlend.

Movie reviews from movie websites Moviefone and Rotten Tomatoes were also fetched, and semantic scores and emotions were computed.

Table 7 presents the movie review tokenization and tagging from movie websites CinemaBlend, Moviefone, and Rotten Tomatoes.

The parsing of movie reviews’ tokens taken from CinemaBlend, Moviefone, and Rotten Tomatoes was performed using the Stanford parser. Here, Table 8 presents the sentiment score of movie reviews from CinemaBlend, Moviefone, and Rotten Tomatoes.

Table 9 presents the normalized Twitter likes of movies from Twitter.

Table 10 presents the normalized rating score of movies from CinemaBlend, Moviefone, and Rotten Tomatoes.

Table 11 presents the normalized vote score of movies from CinemaBlend, Moviefone, and Rotten Tomatoes.

Table 12 presents the final score, movie category, medal rank, and genres of movies from CinemaBlend, Moviefone, and Rotten Tomatoes.

Figure 6 presents the multivariate movie ranked recommendation of movies from CinemaBlend, Moviefone, and Rotten Tomatoes.

Figure 7 presents differences in the rating of the movie from CinemaBlend, Moviefone, and Rotten Tomatoes.

Figure 8 presents differences in the votes of the movie from CinemaBlend, Moviefone, and Rotten Tomatoes.

Figure 9 presents differences in the sentiment of the movie from CinemaBlend, Moviefone, and Rotten Tomatoes.

6. Evaluation and Discussion

We evaluate the sentiment classification models as well as recommendation models as follows.

6.1. Sentiment Classification Model Evaluation

Sentiment classification models are evaluated by accuracy and RMSE, which measure the overall performance of the sentiment classification model and the divergence between the predicted and ground-truth sentiment classes, respectively. We compare several baseline sentiment classification methods on three datasets, IMDB, Yelp13, and Yelp14, which contain user reviews, using Stanford CoreNLP. Most of the baseline sentiment classification models categorize the document sentiments in the training set with an SVM classifier using unigram, bigram, and trigram features. Text feature extraction, including character n-grams and word n-grams, is done with the SVM classifier. User leniency features are extracted by UPF [66]. Document representation is obtained by AvgWordvec and fed into an SVM. Feature generation is done by SSWE (sentiment-specific word embedding) [67]. Sentence representation is done by the RNTN (recursive neural tensor network) [68]. Document classification is done by the paragraph vector distributed memory model (PVDM) [69], and topic modeling with collaborative filtering is done by JMARS (https://jmars.asu.edu/). Sentiment classification with vector representations and a text preference matrix is done by the user product neural network (UPNN) [70].
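The two evaluation measures can be computed as in the following sketch. The label values are hypothetical and assume that sentiment classes are encoded as integers on an ordinal scale, so that the difference between two classes is meaningful for RMSE.

```python
import math


def accuracy(gold, predicted):
    """Fraction of documents whose predicted class equals the gold class."""
    return sum(g == p for g, p in zip(gold, predicted)) / len(gold)


def rmse(gold, predicted):
    """Root mean squared error between gold and predicted class values."""
    squared = sum((g - p) ** 2 for g, p in zip(gold, predicted))
    return math.sqrt(squared / len(gold))


# Hypothetical gold and predicted sentiment classes for four reviews.
gold = [5, 3, 1, 4]
predicted = [5, 2, 1, 2]

acc = accuracy(gold, predicted)
err = rmse(gold, predicted)
```

Accuracy counts only exact matches, whereas RMSE also penalizes how far a wrong prediction is from the true class, which is why the two are reported together.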

Table 13 presents the comparison between different sentiment classification models with and without user/product/movie information.

In our approach, the core implementation is neural sentiment classification (NSC) using local user and movie information [71]. As shown in Table 13, it provides a significant 4% improvement over all the baseline methods, which use the local textual information about users and movies. When global information about users and movies is used, the UPNN gains a 3% improvement, while our approach, NSC-UMA, achieves a 9% improvement. Our approach embeds the user and movie information as vectors, which is suitable for larger datasets, whereas the UPNN uses a matrix and a vector simultaneously. NSC-UMA is also able to capture the information from each semantic layer. Therefore, our model incorporates global user and movie information in an efficient and effective way.

Applying attention at both the word level and the sentence level reflects the semantic information of user and movie characteristics at multiple levels and improves performance, which leads to the introduction of user movie attention (UMA) in sentiment classification. Furthermore, user tastes or preferences are easier to interpret than movie attributes, so both user and movie information must receive attention in the document to capture the semantic information that affects the movie ranking for recommendation.
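For illustration, one common way to write attention that is conditioned on user and movie information is sketched below. The notation is assumed here and does not come from this article: $h_i$ is the hidden state of the $i$-th word (or sentence), $u$ and $m$ are the user and movie embedding vectors, and $W_h$, $W_u$, $W_m$, $b$, and $v$ are learned parameters. The authors' exact UMA formulation may differ.

```latex
e_i = v^{\top} \tanh\left( W_h h_i + W_u u + W_m m + b \right), \qquad
\alpha_i = \frac{\exp(e_i)}{\sum_{j} \exp(e_j)}, \qquad
s = \sum_{i} \alpha_i h_i
```

Under this form, the weights $\alpha_i$ sum to one, so $s$ is a weighted average of the hidden states in which words (or sentences) that matter for a given user and movie receive larger weights.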

6.2. Comparative Analysis of Recommendation Models

Table 14 presents the comparison between different models. Four major differences make our work a novel approach: (1) LSTM-UMA for sentiment classification; (2) the NoSQL distributed environment to deal with the big data issues; (3) the multivariate (qualitative and quantitative) score fetched by a web bot from three different reliable external data sources; and (4) app features (movie category and popularity).

The work in [15] uses only implicit and explicit ratings. It applies no user and product attention with LSTM and no adaptive deep learning to determine the preferences and tastes of users about a movie from microblogs, so we can say that the authors did not use qualitative data. It also does not state how the data are fetched or how they are categorized by popularity. In [15, 72–74], multivariates are not used; the study [73] is based only on microblogs, while the study [74] uses movie feature ratings.

6.3. Results of the Experiments

Precision is the number of true positives (recommended movies that are interesting to the user) divided by the number of predicted movies (all recommended movies):

$$\text{Precision} = \frac{TP}{TP + FP}$$

Recall is the number of true positives divided by the number of actual movies (all movies that are interesting to the user):

$$\text{Recall} = \frac{TP}{TP + FN}$$

Figure 10 presents different decisions made by the movie recommender.

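Precision and recall trade off against each other: as the precision of the recommender grows, its recall tends to decline. The F score balances the two as their harmonic mean:

$$F = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}$$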

Table 15 presents the comparisons of the parameters of the multivariate recommendation system and other models.

Table 16 presents the evaluation of the multivariate recommender system with other parameters and other works.

The F score of our system is compared with the other evaluation parameters as well as with other recommendation frameworks. The accuracy of the multivariate system is approximately 98.70%. Its true positive rate is 0.99106, which means that the system recommended nearly all of the movies that users were truly interested in, and its false positive rate is 0.01814, which means that the system rarely recommended movies that users were not interested in.
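The decision parameters discussed in this section can all be derived from the four confusion-matrix counts, as in the sketch below. The counts are hypothetical and are not the experimental values reported in this article; the function names are chosen for illustration.

```python
def safe_div(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def recommender_metrics(tp, fp, fn, tn):
    """Compute standard metrics from confusion-matrix counts.

    tp: interesting movies that were recommended
    fp: uninteresting movies that were recommended
    fn: interesting movies that were not recommended
    tn: uninteresting movies that were not recommended
    """
    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)  # identical to the true positive rate
    return {
        "precision": precision,
        "recall": recall,
        "f_score": safe_div(2 * precision * recall, precision + recall),
        "accuracy": safe_div(tp + tn, tp + fp + fn + tn),
        "false_positive_rate": safe_div(fp, fp + tn),
    }


# Hypothetical counts for 100 candidate movies.
metrics = recommender_metrics(tp=8, fp=2, fn=2, tn=88)
```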

Figure 11 presents differences in the decision parameters (precision, recall, and accuracy) for recommendation of movies from the movie sites CinemaBlend, Moviefone, and Rotten Tomatoes and from other recommendation models.

Figure 11 justifies the difference between our approach and other works [21, 74]. In [21], only the polarity of the term is found, which is not enough to evaluate the reviews for significant recommendation; in [74], LSTM is used to determine the user and movie information, whereas we use NSC-UMA to evaluate the sentiment score of reviews for significant recommendation.

7. Conclusion

In many movie recommendation systems, suggestion and ranking are based only on likes or ratings. Furthermore, most systems extract data from only one or two sites. Semantics, ratings, or votes taken from a single site are neither reliable nor significant for movie ranking: they cannot provide better recommendation services, and there is a large gap between the statistical information (ratings, votes, and likes) and the reviews of movie websites, since some websites produce a qualitative score showing high popularity for a movie while others show low popularity for the same movie. In study [21], deep learning is not used and only word frequency is used. Word frequency is not a good way to evaluate reviews semantically, because it produces only the polarity of the term. Therefore, significant values and semantic information must be obtained efficiently, and the LSTM-attention learning algorithm is used for better semantic analysis, so that the semantic emotional value increases the significance of the recommendation system. Document semantic classification is improved by applying user movie attention at the word and sentence levels, with average pooling at both levels, to enrich the semantic and emotion information about reviews or documents. Attention is applied at the word level because it improves the semantic information of the document compared with applying attention only at the sentence level. The big data issue is addressed by adopting the NoSQL environment.

8. Future Work

This work may be enhanced by adding more parameters, such as session, playlist, user group, smiley, tag, context, and the features of movie and video content.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.