Abstract

In the era of big data, the “Internet + traditional catering” model of online ordering has adapted to consumers’ fast pace of life and personalized consumption patterns and is booming all over the world. However, owing to consumer information asymmetry and the lack of effective supervision, potential food safety problems are becoming increasingly prominent. This paper combines social network analysis and the Latent Dirichlet Allocation (LDA) method to mine the text of consumer reviews on an online ordering platform and identifies five food safety problems on the platform. Text features are then extracted with the BERT, TF-IDF, Word2vec, and N-gram algorithms, and each feature set is paired with classifiers based on the GBDT, XGBoost, LSTM, BiLSTM, CNN, RNN, and CRNN algorithms to identify reviews with potential food safety hazards. Classifier performance is compared and evaluated through ten-fold cross-validation, the Friedman test, and the confusion matrix. The results show that the BERT-GBDT classifier performs best in accuracy, precision, specificity, and F1 measure and is the most stable. It is the most effective at distinguishing reviews with potential food safety hazards.

1. Introduction

With the rise of digital technology, online meal delivery services have flourished worldwide in recent years [13]. An online meal delivery service lets consumers order food online through a third-party platform and have it delivered offline. Online ordering has changed the way customers interact with food services [2, 4]. This “mouse plus wheel” ordering method [5], with meals delivered almost immediately, brings great convenience to consumers.

At present, with the surge in Chinese consumer demand, online ordering platforms such as Meituan, Eleme, and Baidu Takeout continue to expand, and the industry as a whole is growing explosively [6]. In 2015, the market scale of the online ordering industry was 45.87 billion yuan. By the end of 2020, the overall scale of the takeout market had reached 835.2 billion yuan, with nearly 500 million users [7]. Online ordering plays an increasingly significant role in residents’ diets.

Compared with the traditional catering industry, online ordering is an emerging “Internet + traditional catering” mode [8]. It suits the modern pace of life and personalized consumption and makes dining more convenient and faster for consumers. However, because of the virtual nature of the Internet, consumer information asymmetry, and some unscrupulous businesses deliberately providing false information, consumers find it difficult to judge the quality and safety of food and therefore face food safety risks.

Because the speed and means of supervision lag behind such emerging industries, the industry has long developed in an extensive and disorderly manner, giving rise to a series of food safety problems [6], such as opaque food production, no guarantee of safety during delivery, and a lack of supervision of business qualifications. As a result, food safety incidents in online ordering are frequent, and hidden dangers have become increasingly prominent. Establishing a mechanism for recognizing food safety comments from the perspective of consumers is the key to building an effective takeaway food safety monitoring system [9].

A large number of consumer reviews are generated on the online ordering platform. Text reviews contain rich semantic content, such as consumers’ experiences, feelings, and preferences, which are important data for feedback on the food safety of online ordering [10, 11]. At present, review text mining has been widely used in the commercial field to improve the quality of products and services. How to apply the text mining method to quickly screen and identify potential food safety problems will be a new research perspective.

The food safety problems of online ordering platforms need to be solved urgently. First, how can the food safety problems of online ordering be mined and analyzed through text reviews? Second, which review mining method can quickly screen and identify food safety problems in online ordering, so as to provide a practical and low-cost way to monitor food safety rapidly and in real time? From the perspective of consumers, this paper uses text mining to identify reviews with food safety problems, providing a new perspective for the supervision and management of food safety in online ordering.

This paper includes six parts: introduction, related research, research procedures and methods, data source and preprocessing, evaluation of experimental results, and conclusions and prospects.

2. Related Research

At present, some scholars study online ordering services, including the types of online ordering consumers [12], the factors affecting consumers’ satisfaction with online ordering [13], and the impact of online food delivery on health [14]. Such research is mainly carried out through questionnaires and from a causal perspective, and its attribution analysis can hardly avoid interference from subjective factors.

In recent years, data analysis and text mining technology have been applied to food safety and other fields [15, 16]. The research results mainly focus on two aspects, one is to mine text topics, and the other is to classify and predict text sentiment.

Scholars have explored topic mining methods for food review texts. Pantelidis [17] used content analysis to study consumers’ evaluations after eating in restaurants and determined that food, service, atmosphere, price, menu, and decoration are important factors affecting dining evaluation; Zhang and An [18] took food safety incidents as research samples to identify risk factors affecting food safety. Content analysis can characterize the quantity and quality of review text, but it cannot judge food safety.

Song et al. used LDA and K-means clustering to analyze topics related to takeaway food on Sina Weibo [19]; Akila et al. used the LDA topic model to analyze the problems in McDonald’s restaurant service [20]; and Chi et al. used the LDA model to compare user comment topics across different hotel reservation platforms [21]. As an improved model of PLSA, the LDA topic model can overcome the limitations of labels, semantic fuzziness, and multidimensional and sparse link data and has a good clustering effect in text topic analysis [22, 23].

Scholars study the classification and prediction of text sentiment and provide relevant suggestions for products, services, organizations, etc. by mining the views and sentiments contained in the text [24, 25]. At present, scholars mainly use machine learning methods and deep learning methods, as shown in Table 1.

Among studies using machine learning methods, Barrientos et al. [26] used machine learning to detect whether online text contains pornographic content and concluded that the combination of a TF-IDF text encoder and a support vector machine classifier with a linear kernel achieved the best performance. Huang et al. [11] constructed text classifiers based on the support vector machine (SVM), random forest (RF), XGBoost, and GBDT algorithms and analyzed the comment text of crowdsourcing platform participants. The results showed that the GBDT text sentiment classifier was more accurate than the other methods. Zahoor et al. [27] analyzed customer comments on different restaurants in Karachi, Pakistan, and found that random forest classification had the highest accuracy. Yang Li et al. [28] analyzed the sentiment of Weibo posts on environmental public services based on the LDA and XGBoost models.

In recent years, deep learning has gradually been applied to research on review text classification. Anisha et al. [29] used machine learning and deep learning for real-time Twitter spam detection and sentiment analysis; their research shows that the LSTM method achieves the highest validation accuracy for both tasks. Liu et al. [30] combined BERT and BiLSTM to build a model for analyzing public opinion text on sudden public events. Duan et al. [31] proposed a Dict-BERT sentiment analysis model based on a cascaded BERT algorithm and an adaptive sentiment dictionary; when the training set is small, it has obvious advantages over the BERT algorithm. Zeng et al. [32] extracted word-level and sentence-level features of Weibo posts through BiLSTM, combined them with a double-layer attention mechanism to learn the feature weights at each level, and then classified sentiment. Wu et al. [33] analyzed illegal comments involving cyber violence and proposed that RCNN combined with an attention mechanism can be used to extract the context features of comment text on top of BERT, which improves the accuracy of text classification. Li et al. [34] used the bidirectional long short-term memory network (BiLSTM) to improve the accuracy of sentiment classification of food comment texts. This kind of method analyzes, processes, summarizes, and reasons over sentiment-bearing comment text and then draws the corresponding conclusion. Maslej-Krešňáková et al. [35] used a deep learning model to identify toxic comments on the Internet.

To sum up, scholars have carried out in-depth research on the text mining of consumer food consumption comments and achieved rich research results, but there is a relative lack of research on the online ordering form of “Internet + traditional catering.” At present, the potential food safety problems of takeout platforms are increasingly prominent, but the specific food problems of online ordering platforms are not clear, and there is a lack of effective methods to identify the potential food safety problems in online ordering.

Scholars’ analysis of text clustering has gradually developed from content analysis to the use of natural language processing and machine learning technology as the main research methods. Machine learning and deep learning technology are applied to classification and prediction research, but they are applied to different scenes, the methods of constructing feature vectors are different, and there are great differences in the accuracy of different classification technologies. Therefore, how to analyze the main problems of food safety in online ordering from the perspective of consumers through text comment mining, how to select effective text feature extraction methods, and how to use machine learning and deep learning technology to identify food safety hazards are key issues to be discussed.

This paper combines social network analysis and the Latent Dirichlet Allocation (LDA) model to analyze the comment text and extract the main food safety problems on the online ordering platform. Based on the literature review, this paper uses the BERT, TF-IDF, Word2vec, and N-gram algorithms to extract text features and constructs supervised text classifiers based on the GBDT, XGBoost, LSTM, BiLSTM, CNN, RNN, and CRNN algorithms. By evaluating the performance of the different classifiers, this paper puts forward the classification method that most effectively identifies comments indicating food safety hazards on the online ordering platform.

3. Research Procedures and Methods

3.1. Research Procedures

From the perspective of consumers, this paper carries out text mining on the user review text of the online ordering platform. It analyzes the food safety problems of the platform through social network analysis and LDA text clustering and then uses text feature extraction and machine learning for text classification to identify those problems effectively. The specific steps are as follows:

First, data are collected and preprocessed. Consumer review text is obtained from Meituan, China’s largest ordering platform. The text is preprocessed by removing duplicates, removing stop words with the stop word list of Harbin Institute of Technology, and segmenting words with Jieba. Then, through high-frequency word analysis, social network analysis, and LDA text clustering, this paper analyzes the focus and themes of food safety in user comments, examines the correlations among food safety hazards, and extracts the food safety themes involved in the comment text. Finally, using a variety of text feature extraction methods and machine learning algorithms, this paper identifies the best method for recognizing comments with food safety problems, providing a new path for the effective supervision of food safety in online ordering. The specific research steps are shown in Figure 1.

3.2. Research Methods
3.2.1. Text Processing Method

(1) BERT Language Model. In this paper, the BERT model is used for the vector representation of short text. The BERT model adopts the bidirectional Transformer encoder (Trm module) to obtain the feature representation of the text, as shown in Figure 2.

The BERT model adopts a 12- or 24-layer bidirectional Transformer encoder structure, in which $E_1, E_2, \ldots, E_N$ are the input vectors and $T_1, T_2, \ldots, T_N$ are the output vectors after the multilayer Transformer encoder [36]. BERT is first pretrained on a large-scale corpus to obtain network parameters suitable for general natural language processing tasks; the pretrained parameters are then fine-tuned on the text data of the current task so that the model adapts to that task.

In the BERT model, the Transformer encoder consists of a self-attention mechanism and feed-forward neural network units, with residual connections between the units. The Transformer is a self-attention-based seq2seq model [37]. The self-attention unit is the core of the Transformer encoder: it calculates the relationship between each word and all the words in its sentence and adjusts the weight of each word accordingly to obtain the vector representation of each word.
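The weighting performed by a self-attention unit can be illustrated with a minimal single-head sketch in plain Python. The projection matrices `w_q`, `w_k`, and `w_v` and the toy inputs are assumptions for illustration only; the actual BERT implementation uses multiple heads and learned parameters.

```python
import math

def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def self_attention(x, w_q, w_k, w_v):
    """Scaled dot-product self-attention for one head.

    x holds one row vector per word; w_q, w_k, w_v are hypothetical projections.
    Returns the new word vectors and the attention weights.
    """
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    d_k = len(k[0])
    k_t = [list(col) for col in zip(*k)]          # transpose of K
    scores = matmul(q, k_t)                       # relation of each word to every word
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return matmul(weights, v), weights

# Toy usage: three "words" with two-dimensional vectors and identity projections.
identity = [[1.0, 0.0], [0.0, 1.0]]
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
out, weights = self_attention(x, identity, identity, identity)
```

Each row of `weights` sums to 1, so every output vector is a weighted average of the value vectors of all words in the sentence.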

The input of the encoder is the word vector of the text together with the position of each word. The output of the self-attention unit is added to its input and normalized so that the result has a fixed mean (0) and standard deviation (1). The normalized vector is passed to the feed-forward neural network, whose output again goes through a residual connection and normalization, as shown in Figure 3.
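The add-and-normalize step can be sketched as follows. This is a simplified illustration that omits the learned scale and shift parameters used in practice; the function names are chosen here for illustration.

```python
import math

def layer_norm(vector, eps=1e-12):
    """Normalize a vector to mean 0 and standard deviation 1."""
    mean = sum(vector) / len(vector)
    var = sum((v - mean) ** 2 for v in vector) / len(vector)
    return [(v - mean) / math.sqrt(var + eps) for v in vector]

def add_and_norm(sublayer_input, sublayer_output):
    """Residual connection followed by normalization."""
    summed = [a + b for a, b in zip(sublayer_input, sublayer_output)]
    return layer_norm(summed)

# Toy usage with made-up values.
normalized = add_and_norm([1.0, 2.0, 3.0, 4.0], [0.5, -1.0, 2.0, 0.0])
```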

The input of the BERT model is the sum of the token embedding, segment embedding, and position embedding. The token embedding is the word vector of each token; the first token is the flag [CLS], whose embedding E[CLS] can be initialized randomly. E[SEP] marks the boundary between sentences, and the segment embedding is the vector that distinguishes different sentences. The position embedding represents the position of each word in the text. Therefore, the input of the BERT model contains not only the word meaning of the short text but also the distinction between sentences and the position of each word.
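The element-wise sum of the three embeddings can be sketched with small lookup tables. The tables and their values below are hypothetical placeholders; in BERT they are learned during pretraining.

```python
def bert_input_vectors(token_ids, segment_ids, token_table, segment_table, position_table):
    """Sum token, segment, and position embeddings for each input position."""
    vectors = []
    for position, (token_id, segment_id) in enumerate(zip(token_ids, segment_ids)):
        parts = zip(token_table[token_id], segment_table[segment_id], position_table[position])
        vectors.append([t + s + p for t, s, p in parts])
    return vectors

# Toy usage: two tokens in the same sentence, two-dimensional embeddings (made-up values).
token_table = {0: [1.0, 0.0], 1: [0.0, 1.0]}
segment_table = {0: [0.1, 0.1]}
position_table = [[0.01, 0.02], [0.03, 0.04]]
inputs = bert_input_vectors([0, 1], [0, 0], token_table, segment_table, position_table)
```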

(2) Latent Dirichlet Allocation (LDA). The LDA topic model is a document topic generation model developed on the basis of Probabilistic Latent Semantic Analysis (PLSA) [38]. It uses the bag-of-words method to convert complex, unstructured user comments into simple numerical information [39]. In the LDA topic model, a document is generated as follows: first, the topic distribution $\theta_i$ of document $i$ is sampled from the Dirichlet distribution with parameter $\alpha$; then the topic $z_{i,j}$ of the $j$-th word of document $i$ is sampled from the multinomial distribution $\theta_i$; finally, the word $w_{i,j}$ is sampled from the word distribution $\phi_{z_{i,j}}$ corresponding to topic $z_{i,j}$. The LDA topic model can effectively discover the topic features of short text and is widely used.
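The generative process described above can be sketched with the standard library alone. The vocabulary, the topic-word probabilities, and the Dirichlet parameter below are made-up illustrative values, not quantities estimated in this paper.

```python
import random

def sample_dirichlet(alpha, rng):
    """Draw one sample from a Dirichlet distribution via normalized gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

def generate_document(n_words, alpha, topic_word, vocab, rng):
    """Generate one document following the LDA generative story."""
    theta = sample_dirichlet(alpha, rng)            # topic distribution of the document
    topic_ids = list(range(len(alpha)))
    doc = []
    for _ in range(n_words):
        z = rng.choices(topic_ids, weights=theta)[0]          # topic of this word
        w = rng.choices(vocab, weights=topic_word[z])[0]      # word drawn from that topic
        doc.append((z, w))
    return theta, doc

# Toy usage: two hypothetical topics over a three-word vocabulary.
rng = random.Random(0)
vocab = ["sour", "hair", "cold"]
topic_word = [[0.8, 0.1, 0.1], [0.1, 0.8, 0.1]]
theta, doc = generate_document(10, [0.5, 0.5], topic_word, vocab, rng)
```

Training an LDA model inverts this process: given only the observed words, it estimates the document-topic and topic-word distributions.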

3.2.2. Text Classification Algorithm

(1) Gradient Boosting Decision Tree (GBDT). The GBDT algorithm combines the gradient boosting algorithm with the decision tree algorithm. Its core is to reduce residuals, that is, to generate each new decision tree in the direction of the negative gradient so as to reduce the residuals left by the previous trees. The basic idea of boosting is to make the loss function decrease with every model that is built, so that the model keeps improving in the direction of optimization; the GBDT algorithm makes the loss function decrease along its gradient direction. A decision tree has low time complexity and fast prediction speed, but a single decision tree easily overfits, which affects the final classification results. The GBDT algorithm combines hundreds of trees as multiple classifiers, which reduces the overfitting of a single decision tree. Moreover, each classifier is simple, so training is correspondingly faster [40].

For a sample set $\{(x_i, y_i)\}_{i=1}^{N}$, the loss function $L(y, F(x))$ of the additive model $F(x) = \sum_{m} \beta_m h(x; \alpha_m)$ is calculated, and the optimal parameters $\{\alpha_m, \beta_m\}$ are those that minimize the loss function, where $\alpha_m$ denotes the parameters of the $m$-th base model $h$ and $\beta_m$ is the weight of that base model. The process of optimizing the parameters is a gradient optimization process. Assuming that the model $F_{m-1}$ built from the first $m-1$ base models has been obtained, the $m$-th base model is calculated by first computing the gradient of the loss at $F_{m-1}$ to obtain the direction of steepest descent. The final result depends on the sum of the results of the multiple base models. The multiclassifier integration process of the GBDT algorithm is as follows:

$$g_m(x_i) = \left[ \frac{\partial L(y_i, F(x_i))}{\partial F(x_i)} \right]_{F(x) = F_{m-1}(x)}$$

Here $i$ ranges from 1 to $N$, so one $g_m(x_i)$ is obtained for each data point $x_i$, and the complete gradient descent direction is $-g_m = \{-g_m(x_1), \ldots, -g_m(x_N)\}$, where $m$ ranges from 1 to $M$. To make $h(x; \alpha)$ follow the direction of $-g_m$, the least squares formula is used:

$$\alpha_m = \arg\min_{\alpha, \beta} \sum_{i=1}^{N} \left[ -g_m(x_i) - \beta h(x_i; \alpha) \right]^2$$

Similarly, with $\alpha_m$ fixed, the weight is obtained by

$$\beta_m = \arg\min_{\beta} \sum_{i=1}^{N} L\left(y_i, F_{m-1}(x_i) + \beta h(x_i; \alpha_m)\right)$$

Finally, the result of the $m$-th model is obtained by adding the new base model to the result of the $(m-1)$-th model:

$$F_m(x) = F_{m-1}(x) + \beta_m h(x; \alpha_m)$$

GBDT algorithm is suitable for processing various types of data and can adapt to a variety of loss functions. This method is suitable for low-dimensional dense data. The model has good interpretability and wide application fields.
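The residual-fitting idea behind GBDT can be illustrated with a minimal sketch. It makes several simplifying assumptions that are not taken from this paper: a single numeric feature, one-split regression stumps as base models, squared loss (whose negative gradient is simply the residual), and a fixed learning rate in place of a per-round weight search.

```python
def fit_stump(xs, residuals):
    """Fit a one-split regression stump to the residuals by least squares."""
    best = None
    for threshold in sorted(set(xs)):
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        if not left or not right:
            continue
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = sum((r - left_mean) ** 2 for r in left) + sum((r - right_mean) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    if best is None:  # all feature values identical: predict the mean residual
        mean = sum(residuals) / len(residuals)
        return (xs[0], mean, mean)
    return best[1:]

def predict_stump(stump, x):
    threshold, left_value, right_value = stump
    return left_value if x <= threshold else right_value

def fit_gbdt(xs, ys, n_rounds=20, learning_rate=0.5):
    """Additive model: start from the mean, then repeatedly fit stumps to residuals."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        # Negative gradient of 0.5 * (y - F)^2 with respect to F is the residual y - F.
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * predict_stump(stump, x) for p, x in zip(preds, xs)]
    return base, stumps, learning_rate

def predict_gbdt(model, x):
    base, stumps, learning_rate = model
    return base + learning_rate * sum(predict_stump(s, x) for s in stumps)

# Toy usage: label 1 stands for "hazard", 0 for "no hazard" (made-up data).
xs = [1, 2, 3, 4, 5, 6]
ys = [0, 0, 0, 1, 1, 1]
model = fit_gbdt(xs, ys)
```

Each round shrinks the remaining residual, which is the sense in which the loss decreases with every added tree.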

(2) Extreme Gradient Boosting (XGBoost). XGBoost is an optimized, distributed gradient boosting tree algorithm. Its idea is to generate multiple tree models of different forms through feature splitting. During training, decision trees are continuously added to the model to reduce the error of the previous prediction and improve prediction accuracy. The method supports parallelization, runs fast, is robust, and has been widely applied in many fields.

(3) Long Short-Term Memory (LSTM). A long short-term memory network is an improved recurrent neural network with the ability to memorize both long-term and short-term information. It addresses the RNN’s tendency to forget long-range sequence information by constructing a memory cell [41]. The LSTM network mainly relies on the forget gate, input gate, and output gate. With the model parameters fixed, the outputs of the neural units at different time steps can change dynamically, which avoids the problems of vanishing or exploding gradients.
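A single LSTM time step can be sketched with scalar states to show how the three gates interact. The parameter names in the dictionary `p` are hypothetical, and a real network uses vectors and weight matrices rather than scalars.

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def lstm_step(x, h_prev, c_prev, p):
    """One LSTM step with scalar input, hidden state, and cell state."""
    f = sigmoid(p["wf"] * x + p["uf"] * h_prev + p["bf"])    # forget gate
    i = sigmoid(p["wi"] * x + p["ui"] * h_prev + p["bi"])    # input gate
    o = sigmoid(p["wo"] * x + p["uo"] * h_prev + p["bo"])    # output gate
    g = math.tanh(p["wg"] * x + p["ug"] * h_prev + p["bg"])  # candidate memory
    c = f * c_prev + i * g                                   # updated cell state
    h = o * math.tanh(c)                                     # new hidden state
    return h, c

# Toy usage: a forget gate near 1 and an input gate near 0 preserve the old memory.
params = {"wf": 0.0, "uf": 0.0, "bf": 10.0,
          "wi": 0.0, "ui": 0.0, "bi": -10.0,
          "wo": 0.0, "uo": 0.0, "bo": 0.0,
          "wg": 0.0, "ug": 0.0, "bg": 0.0}
h, c = lstm_step(1.0, 0.0, 0.8, params)
```

Because the cell state is updated additively and gated, information can be carried across many time steps.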

(4) Bidirectional Long Short-Term Memory (BiLSTM). A bidirectional long short-term memory network is composed of a forward LSTM and a backward LSTM. A single BiLSTM layer processes the sequence once in the forward direction with one LSTM and once in the reverse direction with another, and the outputs of the two LSTMs are then concatenated. This method can capture both past and future information and therefore deeper context dependencies, which overcomes the limitation that a one-way LSTM can only process text in a single direction.

(5) Convolutional Neural Networks (CNN). A convolutional neural network includes a feature extractor composed of convolution layers and subsampling layers (pooling layers). It is a deep learning model that uses nonlinear mapping to reduce the dimension of local data and has advantages in processing multidimensional data. Through local connections and parameter sharing, a convolutional neural network reduces the number of network parameters and the risk of overfitting [42].
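The convolution and pooling operations can be illustrated on a one-dimensional sequence. This is a minimal sketch with made-up numbers; the kernel is applied without flipping, as deep learning libraries usually do.

```python
def conv1d(sequence, kernel):
    """Slide one shared kernel over the sequence (local connection, shared weights)."""
    k = len(kernel)
    return [sum(sequence[i + j] * kernel[j] for j in range(k))
            for i in range(len(sequence) - k + 1)]

def max_pool1d(features, size):
    """Keep the largest value in each non-overlapping window."""
    return [max(features[i:i + size]) for i in range(0, len(features) - size + 1, size)]

# Toy usage.
feature_map = conv1d([1, 2, 3, 4, 5], [1, 0, -1])
pooled = max_pool1d([1, 3, 2, 5, 4, 0], 2)
```

The same three kernel weights are reused at every position, which is why the number of parameters does not grow with the length of the input.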

(6) Recurrent Neural Networks (RNN). A recurrent neural network is a kind of recursive neural network that takes sequence data as input, recurses along the direction in which the sequence evolves, and links all of its nodes (recurrent units) in a chain [43]. It is well suited to processing time series data and uses the memory held in the weight matrices of the hidden layer neurons to learn the time dependence of the system state.

(7) Recurrent Convolutional Neural Networks (RCNN). The recurrent convolutional neural network combines the recurrent neural network and the LSTM network. Through the connections between hidden layer nodes, the recurrent neural network applies the memory of previous information to the current output in order to capture context information. However, vanishing and exploding gradients appear during training, so only a small amount of context information can be captured. The LSTM network uses different functions to handle the state of the hidden layer, screens important information, and can largely solve the gradient problem. RCNN combines the advantages of RNN and CNN.

4. Data Source and Preprocessing

4.1. Data Sources

According to the annual analysis 2020 of China’s Internet ordering market released by Beijing Yiguan Zhiku Network Technology Co., Ltd., the online ordering platforms Meituan and Eleme account for more than 90% of China’s takeout market, of which Meituan accounts for about 60% of the total market share. Among consumers of online ordering platforms, the white-collar business market and the student campus market are the two major takeout segments, accounting for more than 80% of takeout transactions. Therefore, this paper selects user text reviews from business districts and university areas on the Meituan platform as the research sample.

To ensure the representativeness of the sample, the businesses selected in this study include not only Western restaurants, such as KFC and Starbucks, but also Chinese restaurants, such as those selling braised chicken and spicy hot pot. Online text reviews posted by users on the Meituan platform from January 1, 2020, to December 31, 2021, were collected, yielding 45,327 user reviews. After duplicate comments, meaningless comments containing only numbers or symbols, irrelevant comments, and comments with fewer than 5 characters were deleted, 21,987 valid text reviews remained.
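The mechanical part of this filtering can be sketched as follows. The function name and the sample reviews are illustrative assumptions; removing irrelevant comments requires human judgment and is not covered by the sketch.

```python
MIN_CHARS = 5  # threshold stated in the text: drop comments shorter than 5 characters

def clean_reviews(reviews):
    """Drop short, digit/symbol-only, and duplicate reviews, keeping first occurrences."""
    seen = set()
    kept = []
    for text in reviews:
        text = text.strip()
        if len(text) < MIN_CHARS:
            continue                      # too short to be informative
        if not any(ch.isalpha() for ch in text):
            continue                      # only digits, punctuation, or symbols
        if text in seen:
            continue                      # exact duplicate
        seen.add(text)
        kept.append(text)
    return kept

# Toy usage with made-up reviews; str.isalpha() is also true for Chinese characters.
raw = ["12345678", "The rice was sour and smelled bad", "The rice was sour and smelled bad",
       "ok", "!!!!!!!", "米饭有馊味,吃完拉肚子"]
valid = clean_reviews(raw)
```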

4.2. Data Preprocessing

The text reviews are labeled manually. In this study, reviews indicating food safety hazards are labeled “1,” and reviews without food safety hazards are labeled “0.”

Food safety problems include potential health consequences of eating the food and physical discomfort after eating it, such as stomach discomfort, vomiting, and diarrhea. According to the provisions of the Food Safety Law of the People’s Republic of China, spoilage, rancid oil, mildew and insect infestation, dirt, foreign matter, and adulteration are all food safety problems.

Reviews that do not meet the above conditions are regarded as containing no food safety problem. Other situations in which the ordering service fails to meet consumers’ expectations, such as slow delivery, poor delivery attitude, untimely response, and missing or wrong items, are not potential food safety hazards.

The NLPIR Chinese analysis system is used to segment the collected review text from the takeout platform. Review text is short and colloquial. After word segmentation, a large number of special characters, strings, numbers, and English words appear, and these have either ultrahigh or ultralow frequency. This paper deletes these strings as stop words. In addition, connectives, exclamations, pronouns, and other words irrelevant to the transaction content, such as “ah,” “but,” and “so,” are removed to obtain the review text feature set.

5. Evaluation of Experimental Results

5.1. Analysis of Food Safety Problems
5.1.1. Word Frequency Analysis of Reviews Text

Based on the text features extracted from the consumer review text of the online ordering platform, the content and frequency of the text features are counted, and the hidden food safety problems are mined. To show intuitively the food safety focus and themes that concern consumers, this study visualizes the text features by drawing word clouds.

Through text labeling, a total of 3,193 reviews with potential food safety problems are obtained. The feature words and word frequencies of the reviews involving food safety problems are counted, including nouns, adjectives, and verbs. The analysis shows that the high-frequency words in the review text of the online ordering platform relate to food safety hazards, including “deterioration,” “fishy smell,” “mildew,” “nausea,” “stomachache,” “gutter oil,” “sour smell,” and “mixed.” The word cloud of high-frequency words in the review corpus is shown in Figure 4.
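Counting feature-word frequencies over segmented reviews can be sketched with the standard library. The tokenized reviews and the stop word set below are made-up examples, not data from this study.

```python
from collections import Counter

def top_terms(tokenized_reviews, stop_words, n=10):
    """Return the n most frequent tokens across all reviews, ignoring stop words."""
    counts = Counter(
        token
        for review in tokenized_reviews
        for token in review
        if token not in stop_words
    )
    return counts.most_common(n)

# Toy usage.
reviews = [["rice", "sour", "smell"], ["sour", "nausea", "ah"], ["sour", "smell"]]
frequent = top_terms(reviews, stop_words={"ah"}, n=2)
```

The resulting (word, count) pairs are the kind of input a word cloud tool scales font sizes by.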

5.1.2. Feature Association Analysis Based on Social Network

Text feature extraction and word cloud analysis reveal some of the food safety problems faced by takeout platform users but not the relationships among them. Social network analysis can be used to find the relationships among the various food safety hazards. In addition, analyzing the central nodes of the semantic network further clarifies the characteristics of the comment text.

This paper analyzes the social network of the comment text with the ROSTCM6 (ROST Content Mining System version 6.0) content mining system. The preprocessed user review text is imported into ROSTCM6 to generate a feature word list and a co-word matrix representing the links between keywords. The top 100 high-frequency words in the text features are selected for social network analysis, and the average degree, average weighted degree, closeness centrality, average clustering coefficient, and eigenvector centrality of the co-word matrix are calculated. Gephi 0.9.3 is used to visualize the social network of high-frequency word co-occurrence, as shown in Figure 5.

In social network analysis, each node represents a keyword, and the number on the edge between two nodes is the frequency with which the two keywords occur together. “Deterioration” is connected to “fishy smell,” “spit out,” “half-cooked,” “very hard,” and other nodes, indicating that each pair of words appears in the same reviews. “Vomit” and “spit out” are strongly related, and “spicy” is related to “spit out,” “leftovers,” and other nodes. The analysis shows that “deterioration” has the highest eigenvector centrality, 0.864, followed by “sour taste” and “black,” at 0.792 and 0.568, respectively; that is, “deterioration” has the strongest correlation with other keywords. Food deterioration is therefore the most common and prominent food safety problem in takeout.
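The two quantities used here, co-occurrence counts and eigenvector centrality, can be sketched in plain Python. This is an illustrative power-iteration implementation on made-up reviews; it is not the procedure used internally by ROSTCM6 or Gephi, and its scaling (highest score set to 1) is an assumption of the sketch.

```python
from collections import defaultdict
from itertools import combinations

def cooccurrence(tokenized_reviews):
    """Count how many reviews contain each unordered pair of keywords."""
    weights = defaultdict(int)
    for review in tokenized_reviews:
        for a, b in combinations(sorted(set(review)), 2):
            weights[(a, b)] += 1
    return dict(weights)

def eigenvector_centrality(weights, n_iter=100):
    """Power iteration on the weighted co-occurrence graph, scaled so the top score is 1."""
    nodes = sorted({node for pair in weights for node in pair})
    score = {node: 1.0 for node in nodes}
    for _ in range(n_iter):
        new = dict(score)  # keep the previous score, which avoids oscillation
        for (a, b), w in weights.items():
            new[a] += w * score[b]
            new[b] += w * score[a]
        top = max(new.values())
        score = {node: value / top for node, value in new.items()}
    return score

# Toy usage with made-up segmented reviews.
reviews = [["deterioration", "sour"], ["deterioration", "fishy"],
           ["deterioration", "vomit"], ["sour", "fishy"]]
edges = cooccurrence(reviews)
centrality = eigenvector_centrality(edges)
```

A keyword scores high when it co-occurs with other keywords that themselves score high, which is why a hub word linked to many hazard terms ranks first.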

5.1.3. Text Topic Analysis Based on the LDA Topic Model

The analysis of high-frequency words and social networks reveals the food safety hazard factors and the relationships among them, but it does not mine the semantics of the comment text further. The topic model is an effective method for text semantic mining. The LDA topic model is a document topic model with a three-tier structure of document, topic, and vocabulary and is therefore also called a three-tier Bayesian probability model. It is one of the most convenient and effective topic models. This paper uses the LDA topic model to find the latent topic information in the reviews.

Each theme extracted by the LDA topic model should be an understandable, meaningful, and compact semantic cluster, and the similarity between different themes should be low. Therefore, choosing an appropriate number of themes directly affects the quality of text extraction and the interpretation of the results. In this paper, the number of themes in the LDA topic model is determined by the perplexity evaluation method.

In the LDA model, perplexity can be understood as the trained model’s uncertainty about which theme a document belongs to. The text feature weights are calculated with the TF-IDF method, the LDA model is trained, and the perplexity is calculated for numbers of themes ranging from 1 to 20. The results show that when the number of themes is 5, the perplexity of the model falls to a local minimum and an “inflection point” appears, as shown in Figure 6.
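For reference, perplexity is commonly defined as follows in the topic modeling literature; the notation is introduced here for illustration and may differ from the exact variant computed by the software used in this study. For a corpus $D$ of $M$ documents, where document $d$ has $N_d$ words $w_{d,1}, \ldots, w_{d,N_d}$, topic proportions $\theta_{d,k}$ over $K$ themes, and topic-word probabilities $\phi_{k,w}$:

```latex
\mathrm{perplexity}(D) = \exp\left( - \frac{\sum_{d=1}^{M} \log p(\mathbf{w}_d)}{\sum_{d=1}^{M} N_d} \right),
\qquad
p(\mathbf{w}_d) = \prod_{n=1}^{N_d} \sum_{k=1}^{K} \theta_{d,k}\, \phi_{k,\, w_{d,n}}
```

A lower value means the model assigns higher probability to the observed words, so the number of themes is chosen where the curve reaches a minimum or stops improving appreciably.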

When the number of themes is 6, the perplexity increases slightly and then declines in a fluctuating manner, reaching its lowest value when the number of themes is 20. When the number of themes is too large, each theme becomes abstract and the information overlap between themes is high; when the number is too small, the information contained in each theme is disordered. Therefore, this paper sets the number of themes extracted by the LDA model to 5. That is, when LDA themes are used for text clustering, five themes cover the information in the text well while avoiding the overlap between themes caused by too many themes.

The LDA model is used to extract the text features of the five themes, and the keywords related to each theme are output. Because the LDA model is an unsupervised model, some extracted text features are not significantly related to the theme content. Therefore, this paper selects the 8 words with the greatest weight for each theme as its keywords. The theme feature words extracted from the food safety review text are shown in Table 2.

Through text cluster analysis with the LDA model, five themes of food safety problems on the ordering platform are obtained from the perspective of consumers: causing physical discomfort, awful smell, mixed with foreign matters, inadequate delivery, and inappropriate food doneness.

Theme one is causing physical discomfort. Text clustering shows that eating food from the online ordering platform causes “diarrhea,” “food poisoning,” “vomit,” and other adverse reactions, which is a prominent food safety problem faced by takeout platforms. After eating takeout, consumers have an unpleasant consumption experience that harms their health. This theme reflects the possible deterioration of the food sold by merchants on the online ordering platform, which leads to adverse reactions after consumers eat it.

Theme two is awful smell. The food sold on the takeout platform has odor problems such as “rancid smell,” “fishy smell,” “malodorous smell,” and “bad smell.” “Rancid smell” refers to the sour smell caused by the deterioration of leftovers that have been kept for a long time. It is mainly caused by bacterial contamination (mainly Bacillus cereus, a Gram-positive bacterium that readily causes food poisoning). In addition, if protein-rich foods such as pork, fish, tofu, and eggs deteriorate, the protein is decomposed into organic amines, sulfides, skatole, and aldehydes under the action of microorganisms and enzymes, producing a “fishy smell” and a “malodorous smell,” and eating such food can easily cause food poisoning. The theme “awful smell” essentially reflects the problems of “food deterioration” and “mildew” in the purchased takeout products.

Theme three is mixed with foreign matters. Consumers found “hair,” “cockroach,” “steel wire,” and other foreign matters in food purchased on the online ordering platform. The keywords extracted from consumer comments indicate that hygiene conditions are poor and that foreign matters are mixed in during food processing. “Cockroaches” are one of the four pests. They carry a large number of bacteria and germs, and food in their activity area may be contaminated. Cockroaches found in consumers’ food will affect people’s health. This theme reflects that merchants whose food is mixed with foreign matters may have poor hygiene and food safety practices that do not meet the provisions of food safety supervision and management.

Theme four is inadequate delivery. In the process of food distribution on the online ordering platform, there are problems such as “cold,” “fly,” “oily,” and “spilled.” Food arriving “all cold” is a common problem in the distribution process of online ordering platforms, especially for Chinese food. Before food on the online ordering platform reaches consumers, it generally goes through food packaging, pickup, transport, and handover to the consumer. Factors such as the speed of delivery, whether the food leaks, and the place where the food is handed over to consumers all affect food safety.

Theme five is inappropriate food maturity. The text feature keywords extracted by LDA cluster analysis include “mixed,” “raw,” “blood,” and “paste flavor.” The text analysis shows that inappropriate food maturity on online ordering platforms mainly takes two forms: “not cooked” and “burnt.” Half-cooked food may contain toxins and can cause intestinal parasite infection and liver damage, which is not conducive to health. Eating burnt food also harms the body.

5.2. Distinguish Food Safety Problems

This paper takes 21,987 text reviews obtained from Meituan, the largest online ordering platform in China, as the data set; uses BERT, TF-IDF, Word2vec, and N-gram algorithms to extract text features; and constructs supervised text classifiers based on GBDT, XGBoost, LSTM, BiLSTM, CNN, RNN, and CRNN algorithms. Classifier performance is analyzed and evaluated through ten-fold cross-validation, the confusion matrix, the Friedman test, and the Kruskal-Wallis test, and the best method for identifying food safety problems of online ordering platforms based on review text mining is proposed.

5.2.1. Ten-Fold Cross-Validation

The text data set collected from the online ordering platform is divided into training and test sets for ten-fold cross-validation. The classifier parameters are set manually by comparing the changes in classifier accuracy under different parameter values. For XGBoost, the learning rate is set to 0.1, the depth of each tree to the default value of 6, and the coefficient of the number of leaf nodes in the penalty term to the default value of 0; each decision tree is built on a random 80% of the samples and a random 80% of the features. The learning rate of the LSTM, BiLSTM, CNN, RNN, and CRNN algorithms is set to 0.001, and the number of iterations to 50. All algorithms are run in Python 3.6.2.
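The splitting and parameter settings described above can be sketched as follows. This is a minimal illustration in plain Python, not the code used in this paper: the helper `k_fold_indices`, the random seed, and the mapping of the reported settings onto the XGBoost parameter names `learning_rate`, `max_depth`, `gamma`, `subsample`, and `colsample_bytree` are assumptions made for the example.

```python
import random
from typing import Iterator, List, Tuple


def k_fold_indices(n_samples: int, k: int = 10, seed: int = 0) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, test_indices) for each of k shuffled folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Spread the remainder so that fold sizes differ by at most one sample.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


# 21,987 is the number of reviews reported in Section 5.2.
folds = list(k_fold_indices(21987, k=10))

# Assumed mapping of the reported XGBoost settings onto its parameter names.
xgb_params = {
    "learning_rate": 0.1,      # learning rate
    "max_depth": 6,            # depth of each tree (default)
    "gamma": 0,                # coefficient of the leaf count in the penalty term (default)
    "subsample": 0.8,          # random 80% of the samples per tree
    "colsample_bytree": 0.8,   # random 80% of the features per tree
}

print(len(folds), [len(test) for _, test in folds])
```

Each review appears in exactly one test fold, so the ten accuracy values per classifier are computed on disjoint parts of the data.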

Based on four different text feature extraction methods, BERT, TF-IDF, Word2vec, and N-gram, the accuracy of ten-fold cross-validation of GBDT, XGBoost, LSTM, BiLSTM, CNN, RNN, and CRNN algorithm classifiers is shown in Table 3.
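The cross-construction of feature extraction methods and classification algorithms amounts to enumerating every pairing. A minimal sketch follows; the list names and the hyphenated labels are illustrative.

```python
from itertools import product

feature_methods = ["BERT", "TF-IDF", "Word2vec", "N-gram"]
classifiers = ["GBDT", "XGBoost", "LSTM", "BiLSTM", "CNN", "RNN", "CRNN"]

# One label per (feature method, classifier) pair, e.g. "BERT-GBDT".
combinations = [f"{feature}-{clf}" for feature, clf in product(feature_methods, classifiers)]
print(len(combinations))
```

Four feature extraction methods and seven algorithms give 28 classifiers to evaluate.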

In terms of classifier accuracy, under each of the four text feature extraction methods (Bert, TF-IDF, Word2vec, and N-gram), the GBDT classifier achieves the highest accuracy: 0.908, 0.899, 0.88, and 0.899, respectively. This indicates that GBDT has better classification performance than the other six classifiers. The classification accuracy of the deep learning algorithms (LSTM, BiLSTM, CNN, RNN, and CRNN) is not ideal; the lowest, that of Bert-BiLSTM, is 0.779. XGBoost also performs well. The classification accuracies of Bert-XGBoost and TF-IDF-XGBoost are 0.891 and 0.892, respectively, higher than those of the deep learning algorithms under the different feature extraction methods.

Among the deep learning algorithms, the classification accuracy of CRNN is the least ideal. Across the four feature extraction methods, the CRNN classifier is most accurate with Word2vec features (0.815), which is 0.093 lower than the BERT-GBDT classifier with the best classification performance. With TF-IDF features, the classification accuracy of CRNN is the lowest, 0.798. For LSTM, Word2vec-LSTM has the highest accuracy, 0.871, and Bert-LSTM the lowest, 0.781. The experimental results show that, in this case, the deep learning algorithms have no classification advantage over the ensemble learning algorithms. The ensemble learning methods GBDT and XGBoost achieve higher classification accuracy when identifying whether a comment text indicates a food safety hazard.

From the perspective of text feature extraction methods, the four methods in this paper affect classifier accuracy differently. When BERT is used to extract text features, GBDT achieves its highest classification accuracy, 0.908. When TF-IDF is used, the XGBoost and BiLSTM classifiers achieve their best classification accuracy, 0.892 and 0.866, respectively. When Word2vec is used, the LSTM and CRNN classifiers achieve their best classification accuracy. When N-gram is used, the CNN and RNN classifiers have their best classification performance. The results show that the choice of text feature extraction method affects classifier accuracy and that none of the four methods is significantly better than the others.

A statistical significance test is used to judge whether there are significant differences between the classifiers. The ten-fold cross-validation results were tested with the Friedman test in IBM SPSS Statistics 25.0. The chi-square value is 15.536 with 6 degrees of freedom, and the asymptotic significance is 0.016, which is less than the significance level of 0.05, indicating that the seven classifiers selected in this paper for recognizing food safety review text on online ordering platforms differ significantly. The average ranks of the GBDT, XGBoost, LSTM, BiLSTM, CNN, RNN, and CRNN classifiers are 7.0, 5.5, 3.0, 3.5, 3.25, 4.0, and 1.75, respectively, as shown in Table 2. Across the four feature extraction methods, the overall classification performance of the GBDT algorithm is the best: its average rank of 7.0 is the maximum possible among seven classifiers, so it ranks first under every feature extraction method.
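The reported chi-square value can be checked directly from the average ranks. The sketch below uses the standard Friedman statistic without tie correction and assumes that the four feature extraction methods are the blocks being ranked (`n_blocks=4`); the function names are illustrative, and the closed-form tail probability is valid only for an even number of degrees of freedom.

```python
import math
from typing import List

avg_ranks = {
    "GBDT": 7.0, "XGBoost": 5.5, "LSTM": 3.0, "BiLSTM": 3.5,
    "CNN": 3.25, "RNN": 4.0, "CRNN": 1.75,
}


def friedman_statistic(ranks: List[float], n_blocks: int) -> float:
    """Friedman chi-square from average ranks (no tie correction)."""
    k = len(ranks)
    sum_sq = sum(r * r for r in ranks)
    return 12 * n_blocks / (k * (k + 1)) * (sum_sq - k * (k + 1) ** 2 / 4)


def chi2_sf_even_df(x: float, df: int) -> float:
    """Upper-tail probability of the chi-square distribution for even df."""
    if df % 2 != 0:
        raise ValueError("closed form used here requires even degrees of freedom")
    half = x / 2
    return math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(df // 2))


# Assumption: the blocks are the four feature extraction methods.
chi2 = friedman_statistic(list(avg_ranks.values()), n_blocks=4)
p_value = chi2_sf_even_df(chi2, df=len(avg_ranks) - 1)
print(round(chi2, 3), round(p_value, 3))
```

Under these assumptions the computation gives a chi-square of about 15.536 and a tail probability of about 0.016, in line with the values reported above.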

5.2.2. Confusion Matrix

The confusion matrix is an important method to judge the accuracy and robustness of the classifier. In the confusion matrix, each column of the matrix represents the predicted value, and each row represents the actual category. The confusion matrix can be used to determine the number of correctly classified samples and the number of misclassified samples. To more accurately describe the performance of the classifier, the accuracy, specificity, and F1 measure value are used for analysis.

The accuracy rate (precision) is the ratio of the number of correctly predicted positive samples to the total number of samples predicted as positive; specificity is the proportion of actual negative samples that are correctly predicted as negative. The recall rate is the proportion of actual positive samples that are correctly predicted as positive. The F measure is the comprehensive evaluation index of the classification model and is the harmonic mean of the accuracy rate and the recall rate. The harmonic parameter value is usually 1, which gives the F1 measure.
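These definitions can be made concrete with a short sketch. The counts below are hypothetical and are not results from this paper; following the later discussion, “no food safety hazard” is taken as the positive class, which is an assumption of the example.

```python
from typing import Dict


def confusion_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Compute the evaluation metrics from the four confusion-matrix counts."""
    def ratio(num: float, den: float) -> float:
        # Return 0.0 when a denominator is empty instead of raising an error.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)      # accuracy rate in the text above
    recall = ratio(tp, tp + fn)
    specificity = ratio(tn, tn + fp)
    accuracy = ratio(tp + tn, tp + fp + fn + tn)
    f1 = ratio(2 * precision * recall, precision + recall)
    return {
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "accuracy": accuracy,
        "f1": f1,
    }


# Hypothetical counts: tp/fn are "no hazard" reviews, tn/fp are "hazard" reviews.
m = confusion_metrics(tp=80, fp=20, fn=20, tn=30)
print(m)
```

With these counts, precision and recall are both 0.8, so the F1 measure is also 0.8, while specificity is 30/50 = 0.6.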

In this paper, the online ordering platform text reviews are divided into “food safety hazards” and “no food safety hazards.” Classifiers are cross-constructed from the different feature extraction methods and classification algorithms, and the best method for identifying food safety hazards of online ordering platforms through text mining is then put forward. Taking the online ordering platform review text data set as an example, the accuracy, specificity, and F1 measure of the different classifiers are calculated from the confusion matrix. The results are shown in Table 4.

In terms of accuracy rate, the confusion matrix results show that under all four text feature extraction methods (Bert, TF-IDF, Word2vec, and N-gram), the GBDT classifier has the highest accuracy rate; that is, correctly classified “no food safety hazard” comment texts account for the highest proportion of all texts predicted as “no food safety hazard.” This shows that the GBDT classifier has significant advantages in classification performance across the feature extraction methods, is clearly superior to the other classification algorithms, and has the strongest classification stability. When Bert, TF-IDF, and N-gram are used to extract text features, the performance of the XGBoost algorithm is slightly lower than that of GBDT but better than that of LSTM, BiLSTM, CNN, RNN, and CRNN. The performance gap among the five deep learning algorithms is not significant.

From the perspective of specificity, it is more important to accurately predict the negative samples of “potential food safety hazards.” Whether the classifier can accurately identify the comment text with “potential food safety hazards” is directly related to the actual effect of the classifier application. Through the analysis, it is found that the specificity of TF-IDF-LSTM, N-gram-LSTM, TF-IDF-BiLSTM, and N-gram-BiLSTM is 0, indicating that when TF-IDF and N-gram are used to extract text features, LSTM and BiLSTM classifiers cannot recognize the comment text with potential food safety hazards. When using Bert and Word2vec to extract text features, the specificity of LSTM and BiLSTM classifiers are 0.126 and 0.007, respectively, indicating that the classifier has improved the ability to distinguish comment texts with potential food safety hazards, but the effect is not significant.
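One way in which a specificity of 0 can coexist with a reasonable accuracy is a classifier that assigns every review to the majority class. The sketch below illustrates this with a hypothetical class ratio (870 “no hazard” and 130 “hazard” reviews); the ratio, labels, and function name are assumptions for the example and are not taken from the data set.

```python
from typing import List, Tuple


def accuracy_and_specificity(y_true: List[str], y_pred: List[str], negative: str) -> Tuple[float, float]:
    """Return (accuracy, specificity), treating `negative` as the negative class."""
    pairs = list(zip(y_true, y_pred))
    correct = sum(t == p for t, p in pairs)
    actual_negative = sum(t == negative for t in y_true)
    true_negative = sum(t == negative and p == negative for t, p in pairs)
    accuracy = correct / len(y_true)
    specificity = true_negative / actual_negative if actual_negative else 0.0
    return accuracy, specificity


# Hypothetical unbalanced data: the classifier predicts "no_hazard" for everything.
y_true = ["no_hazard"] * 870 + ["hazard"] * 130
y_pred = ["no_hazard"] * 1000

acc, spec = accuracy_and_specificity(y_true, y_pred, negative="hazard")
print(acc, spec)
```

In this illustration, accuracy equals the share of the majority class even though no hazard review is recognized, which is why specificity is examined alongside accuracy.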

Under all four text feature extraction methods, the GBDT classifier has the highest specificity among the seven classifiers. The BERT-GBDT classifier has the highest specificity of all, 0.531, which is 0.235, 0.344, and 0.235 higher than that of TF-IDF-GBDT, Word2vec-GBDT, and N-gram-GBDT, respectively. This shows that, with Bert text features, the GBDT classifier is considerably better at recognizing comment texts with food safety hazards and has the best recognition effect.

The F1 measure gives a comprehensive judgment of the performance of the comment text recognition classifiers. Because the “accuracy rate” and the “recall rate” constrain each other, classifier performance is judged by their harmonic mean. The research shows that, under all four text feature extraction methods, the F1 measure of the GBDT algorithm is higher than that of the other six classification algorithms (XGBoost, LSTM, BiLSTM, CNN, RNN, and CRNN), indicating that GBDT has the best performance. With Bert text features, the F1 measure of the GBDT classifier is 0.936, which is 0.007 lower than that of the TF-IDF-GBDT and N-gram-GBDT classifiers. This shows that the recall rate of the TF-IDF-GBDT and N-gram-GBDT classifiers is higher than that of BERT-GBDT; that is, they correctly predict a higher proportion of the actual positive samples.

In conclusion, under the four text feature extraction methods (Bert, TF-IDF, Word2vec, and N-gram), the accuracy of GBDT, XGBoost, LSTM, BiLSTM, CNN, RNN, and CRNN differs significantly, and the accuracy of the GBDT classifier is always the highest. In this case, the ensemble learning algorithms GBDT and XGBoost have better overall classification performance and faster running speed than the five deep learning algorithms. With Bert text features, the Bert-GBDT classifier has the strongest recognition ability for review texts with food safety hazards and the best performance in terms of accuracy, precision, specificity, and F1 measure, with the strongest stability.

6. Conclusion and Prospects

With the increasingly prominent food safety problems faced by the online ordering platform, it is of great significance to identify the potential food safety hazards of the online ordering platform for the implementation of data-based and information-based dynamic monitoring and supervision of food safety.

This paper analyzes and identifies the food safety problems of the online ordering platform by mining review text and makes two main contributions:

First, through text theme analysis, this paper puts forward five food safety problems existing in the online ordering platform: causing physical discomfort, awful smell, mixed with foreign matters, inadequate delivery, and inappropriate food maturity. Word frequency analysis is used to identify the food safety issues that consumers focus on; social network analysis is used to clarify the central nodes of the semantic network and find the relationships between these influencing factors; and the perplexity index is used to determine the number of themes of the LDA topic model, after which the semantic clusters of text features are extracted and the main food safety problems faced by the online ordering platform are mined.

Second, machine learning and deep learning technology are used to build a text recognition model of food safety hazards on the online ordering platform, and a recognition method for comment text with food safety hazards based on Bert and GBDT (Bert-GBDT) is proposed. With consumer review text as the data set, text features are extracted with the Bert, TF-IDF, Word2vec, and N-gram algorithms, and supervised text classifiers are cross-built on the GBDT, XGBoost, LSTM, BiLSTM, CNN, RNN, and CRNN algorithms. Ten-fold cross-validation, the Friedman test, the confusion matrix, and other indicators are used to analyze and compare the effectiveness of the ensemble learning and deep learning algorithms in identifying potential food safety hazards under the different text feature extraction methods. The experimental results show that the classification accuracy of the deep learning algorithms (LSTM, BiLSTM, CNN, RNN, and CRNN) is not ideal and is lower than that of the ensemble learning algorithm GBDT, which also runs faster. The Bert-GBDT classifier, which uses Bert to extract text features, has the strongest recognition ability, the highest classification accuracy, and strong stability for comment text with food safety hazards.

In the future, the following aspects can be studied. First, obtain more data sets from online ordering platforms, including domestic and foreign platforms, and compare different types of orders, such as Chinese food and Western food, to further verify the applicability of the model. Second, explore the optimization of the food safety problem identification method for online ordering platforms. Because the comment text of the online ordering platform is an unbalanced data set, processing methods for unbalanced data and optimized methods for constructing feature vectors should be examined. Optimized algorithms can further improve the accuracy and robustness of predicting text with food safety hazards. By identifying text that indicates potential safety hazards, the food safety of the online ordering platform can be supervised effectively and the safety of consumers’ food consumption ensured.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

There are no conflicts of interest to declare.

Authors’ Contributions

Y.H. contributed to the writing—original draft preparation, conceptualization, revising, and editing, X.W. contributed to the methodology, R.W. contributed to the software, J.M. contributed to the resources and data curation. All authors have read and agreed to the published version of the manuscript.

Acknowledgments

This work was supported by the National Natural Science Foundation of China under Grant 72101235; the State Scholarship Fund under Grant 202108330330; the China Social Science Foundation under Grant 21BGL163; Scientific Research Foundation of Zhejiang University of Water Resources and Electric Power under Grant xky2022051; Cultivation Project of Water Conservancy Digital Economy and Sustainable Development Soft Science Research Base under Grant xrj2022018; and Jiangxi University Humanities and Social Sciences Project under Grant GL18216.