#### Abstract

Modelling and predicting suspect activity trajectories is of great importance for preventing and fighting crime in the food safety area. Combining artificial intelligence and the multiple U-model algorithm, this paper presents a novel approach to predicting the suspect activity trajectory. Based on social text data, emotional assessment is conducted using the LSTM network to detect food safety criminal suspects. Activity trajectories of criminal suspects are clustered using the graphic clustering method based on the GPS data. A U-model with the sliding window algorithm is proposed to model activity trajectories. Further, the multiple U-model strategy is proposed to predict the activity trajectory based on the accumulated model error of previous positions and multiple clustered trajectories. The simulation study shows that the proposed scheme can detect food safety criminal suspects and predict their activity trajectories effectively.

#### 1. Introduction

Food safety is closely related to everyone’s life and health and to social stability. With the development of modern technology and the increase in cultural and social conflicts, the pressure that food safety risk prevention and control places on police security is increasing significantly. For example, more than 1,200 people were infected with norovirus at the 2018 Winter Olympics in PyeongChang, which not only disrupted the smooth running of the Games but also attracted considerable attention from international public opinion. In addition, according to the China Food Safety Development Report (2018), the number of food safety incidents in the country reached 408,000 in the decade from 2008 to 2017, an average of about 111.8 incidents per day.

The amount of food safety-related data being generated by the police, industry, and academia is increasing rapidly [1]. How to obtain effective criminal intelligence from vast data to prevent crime in the food safety area has become an open issue of worldwide concern. With the development of machine learning and data-driven technology, information of great value can be obtained from massive process data [2, 3]; for example, battery health has been monitored and predicted using the mean entropy and relevance vector machine on experimental data collected from Li-ion batteries [3]. On the one hand, food safety risk identification [4], risk assessment [5], and risk prewarning [6] have been widely studied using the BP neural network, data mining, Bayesian modelling, and other intelligent methods to design food safety early warning models based on food safety monitoring data; such models can effectively identify and memorize the dangerous characteristics of food and predict the risk from input samples.

On the other hand, criminal intelligence extraction from big data has also been studied in various fields. Based on fuzzy clustering technology, Win et al. [7] proposed a crime clustering algorithm to detect potential crime patterns in large-scale spatiotemporal data sets. Based on the detected crime patterns, a crime rate assessment (CRE) algorithm was further proposed to identify the crime rate of each group of locations and target types, and a criminal hotspot location (CHL) algorithm was proposed to predict and highlight hotspots. In response to the challenges that increasing urbanization poses to urban management and services, especially public safety in cities with high crime rates, Catlett et al. [8] proposed a prediction method based on spatial analysis and autoregressive models to automatically detect high-risk crime areas in urban areas and reliably predict crime trends in each area. The result of this algorithm is a spatiotemporal crime prediction model composed of a set of crime-intensive areas and related crime prediction factors; each prediction factor is a prediction model that estimates the number of crimes likely to occur in its related area. Wu et al. [9] give a comprehensive overview of location-prediction methods based on trajectory data, ranging from temporal-pattern-based prediction to spatiotemporal-pattern-based prediction. Morshed et al. [10] visualized crime patterns and improved the ability to accurately predict upcoming events, opening up new possibilities for crime prevention. Their VisCrimePredict system uses visual and predictive analysis to describe crimes that occur in a region or neighborhood. The foundation of VisCrimePredict is a new algorithm that creates trajectories from heterogeneous data sources, such as open data and social media, in order to report criminal incidents. VisCrimePredict provides a proof of concept of crime prediction and uses a long short-term memory (LSTM) neural network to experimentally evaluate the accuracy of crime trajectory prediction. Xiao et al. [11] proposed LSTM and convolutional neural network (CNN) models to predict the crime locations of suspects based on historical activity trajectory data, aiming to locate, track, monitor, or arrest the suspects.

Though there have been many works on obtaining criminal intelligence from big data, it is still an open problem to detect criminal suspects, analyse their GPS data, and predict their activity trajectories in the field of food safety crime. Because of their universal approximation capability, many kinds of neural networks have been widely used to approximate complex systems and show ideal approximation results [10–12]. However, designing a control system based on the established neural network models still faces challenges because most neural network structures are nonlinear. In recent years, the U-model methodology has been widely studied for facilitating nonlinear control system design, owing to its ability to convert a nonlinear polynomial model into a time-varying polynomial model in the controller output [13–15]. However, as far as the authors know, the U-model method has not been used to model trajectories. At the same time, for a complex system with multiple operating conditions or multiple modes, an index function is calculated to decide the most appropriate model [16, 17], or the Euclidean distance is calculated to form a niching strategy that solves the multimodal optimization problem [18]. Aiming to predict the activity trajectories of food safety criminals, this paper combines artificial intelligence and the multiple U-model algorithm. Based on widely collected social text data, food safety-related criminal suspects are detected using natural language processing algorithms. By analysing the GPS data of the detected suspects, typical criminal activity trajectories can be abstracted and clustered. Further, this paper represents activity trajectories by a polynomial function and models them using the U-model strategy. By introducing the idea of multiple model adaptive control, trajectory prediction based on existing activity information is achieved according to the accumulation of model errors.

#### 2. Food Safety Criminal Detection and Activity Trajectory Clustering

##### 2.1. Detection of Food Safety Criminals Using LSTM Based on Social Data

With the advent of the self-media era, social platforms have become an indispensable part of everyday life. In modern policing, identifying persons with a greater criminal tendency through their social speech is the basis of crime prevention work. With the development of natural language processing technology, Chinese text sentiment analysis can address this issue well. To facilitate processing of the social text, the original corpus is converted into word vectors using the word2vec model. Then, an LSTM network built with the Keras deep learning framework is used to evaluate the emotion of a given text, screen out people whose negative emotions indicate a tendency towards crime from massive text data, and predict the possible types of crime.

###### 2.1.1. Preprocessing of Social Text

The most important step in modelling is feature extraction. In natural language processing, the most crucial question is how to express a sentence effectively in numerical form. An initial idea is to assign each word a unique number 1, 2, 3, 4, …, and then treat the sentence as a set of numbers, which is the starting point of one-hot encoding. This encoding has great disadvantages. A stable model will regard 3 and 4 as very close, so the sets [1, 3, 4] and [1, 2, 4] should receive close evaluation results; but when the words represented by 3 and 4 have completely opposite meanings, the classification results cannot be the same. Moreover, one-hot encoding negates the diversity of language semantics in principle. To solve the problem of semantic diversity, Google’s word2vec technology [19] came into being; it maps natural language to multidimensional vectors, thus solving the problem of semantic diversity at the root. There are two important models in word2vec: the CBOW model and the skip-gram model, as shown in Figure 1. CBOW is a continuous bag-of-words model. Given the neighbor words $w_{t-2}$, $w_{t-1}$, $w_{t+1}$, and $w_{t+2}$ in the neighborhood of the central word $w_t$ (the radius is supposed to be 2), CBOW predicts the central word and gives the corresponding probability. The skip-gram model is enhanced from the feedforward neural network model. Its input vector is the one-hot encoding of a word, and the corresponding output is the word vectors of the words near this word.
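To make the difference between the two training setups concrete, the following minimal sketch builds the (context, center) pairs that a CBOW model with radius 2 would be trained on, together with a one-hot vector for a single word. The function names `one_hot` and `cbow_pairs` and the toy sentence are illustrative assumptions, not part of the paper’s implementation.

```python
def one_hot(index, vocab_size):
    """Return a one-hot list with a single 1 at position `index`."""
    vec = [0] * vocab_size
    vec[index] = 1
    return vec


def cbow_pairs(tokens, radius=2):
    """Build (context_words, center_word) pairs for CBOW-style training.

    The context is every word within `radius` positions of the center word.
    """
    pairs = []
    for i, center in enumerate(tokens):
        left = tokens[max(0, i - radius):i]
        right = tokens[i + 1:i + 1 + radius]
        pairs.append((left + right, center))
    return pairs


# Hypothetical toy sentence, already segmented into words.
tokens = ["food", "safety", "is", "very", "important"]
vocab = {word: k for k, word in enumerate(tokens)}

pairs = cbow_pairs(tokens, radius=2)
center_vector = one_hot(vocab["is"], len(vocab))
```

Skip-gram uses the same windows in the opposite direction: the center word is the input and each context word is a prediction target.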

**Figure 1.** The CBOW and skip-gram models of word2vec, shown as panels (a) and (b).

###### 2.1.2. Detection of Food Safety Criminals

With the development of technology, artificial neural networks represented by convolutional neural networks (CNNs) have become increasingly mature, but they are built on the premise that the elements are independent of each other and that the inputs and outputs are independent as well. However, in natural language processing (NLP), the elements are connected with each other. To understand the meaning of a sentence, it is not enough to understand each word in isolation; the whole sequence of connected words must be processed. The intricate structure of the human brain guarantees its memory ability: it can effectively memorize the context of a piece of text and infer from that context, so it can quickly understand language. In the NLP field, the recurrent neural network (RNN) feeds the output of its neurons back to itself at the next timestamp, so its output depends on the current input and the memory of the previous moment. In theory, an RNN can memorize unlimited information, but in practice it encounters problems such as vanishing and exploding gradients when the amount of data exceeds a certain range. To address this, the internal structure of the RNN is improved; the resulting special RNN is called the long short-term memory (LSTM) network, which overcomes the long-term dependency problem of the traditional RNN [20]. The structure of the LSTM network is shown in Figure 2.

The first step of LSTM is to determine what information can be passed through the cell state. This decision is controlled by the “forget gate” layer through a sigmoid function, which generates a value between 0 and 1 based on the output $h_{t-1}$ of the previous time and the current input $x_t$ to decide whether the information learned at the previous time passes fully or partially:

$$f_t = \sigma\left(W_f \cdot [h_{t-1}, x_t] + b_f\right). \tag{1}$$

Updating the state of the cell consists of two parts. First, a sigmoid layer called the “input gate layer” decides which values will be updated from $h_{t-1}$ and $x_t$. Next, a tanh layer creates a vector of new candidate values $\tilde{C}_t$, which will be added to the state:

$$i_t = \sigma\left(W_i \cdot [h_{t-1}, x_t] + b_i\right), \qquad \tilde{C}_t = \tanh\left(W_C \cdot [h_{t-1}, x_t] + b_C\right). \tag{2}$$

Then, the state of the cell is updated by forgetting information and adding candidate values:

$$C_t = f_t * C_{t-1} + i_t * \tilde{C}_t. \tag{3}$$

The output layer consists of two steps. First, a sigmoid layer decides the state feature of the output cell $o_t$. Then, the cell state is put through the tanh function and multiplied by $o_t$, and the final output $h_t$ is obtained:

$$o_t = \sigma\left(W_o \cdot [h_{t-1}, x_t] + b_o\right), \qquad h_t = o_t * \tanh(C_t). \tag{4}$$
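The gate computations described above can be traced in a few lines of pure Python. This is a minimal sketch of one time step under the standard LSTM formulation (forget gate, input gate, candidate values, output gate); the function names, the `params` layout, and the all-zero toy weights are assumptions made for illustration and are not the trained Keras model used in the paper.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(W, b, v):
    """Compute W v + b, with W given as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) + bi for row, bi in zip(W, b)]


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step; `params` maps a gate name to its (W, b) pair."""
    z = h_prev + x_t  # list concatenation, i.e. [h_{t-1}, x_t]
    f = [sigmoid(a) for a in affine(*params["f"], z)]    # forget gate
    i = [sigmoid(a) for a in affine(*params["i"], z)]    # input gate
    g = [math.tanh(a) for a in affine(*params["c"], z)]  # candidate values
    o = [sigmoid(a) for a in affine(*params["o"], z)]    # output gate
    # New cell state: keep part of the old state, add part of the candidates.
    c_t = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
    # New hidden state: gated, squashed cell state.
    h_t = [ok * math.tanh(ck) for ok, ck in zip(o, c_t)]
    return h_t, c_t


# Toy setup (hypothetical): hidden size 2, input size 1, all weights zero,
# so every sigmoid gate equals 0.5 and every candidate value equals 0.
zeros = ([[0.0] * 3 for _ in range(2)], [0.0, 0.0])
params = {"f": zeros, "i": zeros, "c": zeros, "o": zeros}
h_t, c_t = lstm_step([1.0], [0.0, 0.0], [1.0, -2.0], params)
```

With zero weights the forget gate halves the previous cell state and nothing new is written, which makes the role of each gate easy to verify by hand.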

As shown in Figure 3, using Chinese word segmentation technology, the original corpus is decomposed into word vectors, and the decomposition results are compared to judge the criminal types using the LSTM network.

This model is trained on more than 800 comments labelled with emotion markers, as shown in Table 1. Experimental results show that the model can effectively extract the emotion contained in the text and identify common criminal types. For example, when the sentence “He is always bullying me! I am going to fight back and put sleeping pills in his water.” is sent to the detection model, the output sentiment score is negative, and the system determines that this sentence involves gun and explosive crime. In contrast, when the sentence “It’s a beautiful day today.” is sent to the detection model, the system determines that it is positive and does not involve any crime. Using this method, 57 food safety-related criminals are detected, and their corresponding activity trajectories are extracted for further analysis.

##### 2.2. Activity Trajectory Clustering

As most food safety criminal suspects commit crimes in gangs, partners in crime show similar activity trajectories. In this part, activity trajectories are clustered according to the similarity.

The hierarchical clustering method decomposes or merges a given data set until some conditions are met. Depending on whether the hierarchy is formed from bottom to top or from top to bottom, the hierarchical clustering method can be further divided into agglomerative and divisive hierarchical clustering.

Some classical clustering algorithms, such as *k*-means, must first randomly assign *K* class centers, from which the algorithm finds the final centers step by step. If the randomly selected points are not representative, the clustering effect will be relatively poor. The hierarchical clustering algorithm can effectively cluster the initial samples without specifying random center points in advance, but it still needs the number of clustering categories *K* or a termination condition to be specified [21].

Track image processing: from the detected food safety-related criminals, the trajectory images of 24 criminal suspects are selected, as shown in Figure 4 (due to limited space, only the first 6 images are listed). Each track is cropped to the same latitude and longitude range, and the background color is removed and replaced with white. Each track map is composed of pixels; the RGB value of background pixels is (255, 255, 255), and the RGB value of track pixels is (0, 0, 0).

**Figure 4.** Trajectory images of the selected criminal suspects (first six shown), panels (a)–(f).

Determination of the minimum distance: the RGB values of all pixels in each track map are listed as a single $n$-dimensional vector, where $n$ is determined by the number of pixels. The track map is uniquely determined by this vector, so the clustering of track maps can be transformed into the clustering of vectors, and the minimum distance becomes the minimum Euclidean distance in $n$-dimensional space.
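As a small illustration of this vectorization, the sketch below flattens two tiny binary track maps into vectors, channel by channel, and computes their Euclidean distance. The 2 × 2 images and the helper names `flatten_rgb` and `euclidean` are hypothetical; real track maps would simply be larger.

```python
import math


def flatten_rgb(image):
    """Flatten an H x W image of (R, G, B) tuples into one vector.

    The three channels are listed one after another: all R values,
    then all G values, then all B values.
    """
    return [px[ch] for ch in range(3) for row in image for px in row]


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


WHITE, BLACK = (255, 255, 255), (0, 0, 0)

# Two toy track maps that differ in exactly two pixels.
track_a = [[BLACK, WHITE], [WHITE, WHITE]]
track_b = [[WHITE, BLACK], [WHITE, WHITE]]

vec_a = flatten_rgb(track_a)
vec_b = flatten_rgb(track_b)
d = euclidean(vec_a, vec_b)
```

Because the maps are binary, the distance depends only on how many pixels differ, so two suspects with overlapping tracks end up close in this vector space.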

Input: column vectors $x_i$, $i = 1, 2, \ldots, m$, transformed from the three channels of the RGB matrix of the track pictures, and the number of clusters $K$. The process of the hierarchical clustering algorithm for the trajectory graphs is as follows:

*Step 1*. Each track map is recorded as a separate category, $C_i = \{x_i\}$.

*Step 2*. The distance matrix $D = (d_{ij})$ is calculated, where $d_{ij}$ denotes the Euclidean distance in $n$-dimensional space between every two trajectories:

$$d_{ij} = \lVert x_i - x_j \rVert_2 = \sqrt{\sum_{k=1}^{n} \left(x_{ik} - x_{jk}\right)^2}. \tag{5}$$

*Step 3*. The current number of clusters is set as $q = m$, and clustering begins.

*Step 4*. While ($q > K$):

(a) Find the two closest clusters $C_{i^*}$ and $C_{j^*}$ and merge them into $C_{i^*}$.

(b) Decrease the index of every cluster after $C_{j^*}$ by one.

(c) Delete the $j^*$th row and $j^*$th column of the distance matrix $D$.

(d) Update the distance matrix $D$.

(e) Update the current number of clusters, $q = q - 1$.

*Step 5*. Output the generated clusters $C_1, \ldots, C_K$.
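The steps above can be sketched as a short agglomerative routine. This is a minimal illustration that assumes the minimum-distance (single-linkage) criterion and, for brevity, recomputes cluster distances on each pass instead of maintaining the distance matrix explicitly; the function name `single_linkage` and the toy points are assumptions, not the authors’ MATLAB code.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def single_linkage(vectors, k):
    """Merge the two closest clusters until only k clusters remain.

    The distance between two clusters is the minimum pairwise distance
    between their members. Returns clusters as sorted lists of indices.
    """
    clusters = [[i] for i in range(len(vectors))]  # Step 1: one cluster each
    while len(clusters) > k:                       # Step 4: loop until q == K
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = min(euclidean(vectors[i], vectors[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = sorted(clusters[a] + clusters[b])  # merge b into a
        del clusters[b]                                  # later clusters shift down
    return clusters


# Toy data standing in for flattened track maps: two tight pairs and an outlier.
points = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0], [10.0, 0.0]]
result = single_linkage(points, k=3)
```

For 24 track maps the quadratic search is inexpensive; for much larger sets, keeping and updating the distance matrix as in Steps 2 and 4 avoids the repeated distance computations.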

In this experiment, we use agglomerative hierarchical clustering with the minimum-distance criterion, initially treating each person’s track map as its own cluster. The clusters are then traversed, the two clusters with the minimum distance are found, and they are merged into a node, which continues to be traversed as a new cluster. At the same time, the distance between the resulting clusters should be large enough. The binary tree formed by all nodes is shown in Figure 5. As can be seen, the binary trajectory images of the 24 criminals are grouped into 3 clusters.

#### 3. Activity Trajectory Prediction Based on Multiple Model U-Model

In the above section, activity trajectories are grouped into 3 typical clusters. Given the limited trajectory states of the suspect, it is necessary to determine which cluster this trajectory belongs to as soon as possible, so that the path could be predicted and mastered by the police. First, the obtained typical trajectories are described and modelled by the U-model. Next, with the given trajectory states, further trajectory is predicted using the multiple U-model algorithm.

##### 3.1. Activity Curve Modelling Based on U-Model

The activity trace of the food safety criminal suspect can be formulated as the following discrete-time nonlinear function:

$$y(t) = f\big(y(t-1), \ldots, y(t-n), u(t-1), \ldots, u(t-n), \Theta\big), \tag{6}$$

where $f(\cdot)$ is a nonlinear function, $y(t)$ is the output longitude, $u(t)$ is the input latitude at discrete time $t$, $n$ is the plant order, and $\Theta$ is the parameter vector.

For nonlinear polynomial control systems,

According to the definition of the U-model [22], nonlinear function (7) can be mapped into the U-model as follows:

The corresponding regression equation is described as

$$y(t) = \sum_{j=0}^{M} \alpha_j(t)\, u^j(t-1), \tag{9}$$

where $M$ is the degree of the input $u(t-1)$ and $\alpha_j(t)$ is the parameter vector function of the past inputs, outputs, and parameters $\Theta$.

As for activity curves, the current longitude $y(t)$ is not only related to the former latitude $u(t-1)$ but also determined by the current latitude $u(t)$. Thus, taking the current latitude $u(t)$ as the model input in (9), we get the U-model description for the activity curve of the food safety criminal suspect:

$$y(t) = \sum_{j=0}^{M} \alpha_j(t)\, u^j(t). \tag{10}$$

Similar to [23], in which measurement is used to obtain the call admission control signal, the U-model of the activity curve is established from measurement data using the least squares algorithm. As the activity curve is complicated and hard to model with a single simple low-order U-model, we use the U-model (10) with a low degree $M$ and design a sliding-window algorithm to identify the parameters $\alpha_j(t)$, $j = 0, 1, \ldots, M$.

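Defining $N$ as the length of the sliding window, for the sample set $\{(u(i), y(i))\}$, $i = t-N+1, \ldots, t$, according to the principle of least squares, the objective is to minimize the following index function:

$$J(t) = \sum_{i=t-N+1}^{t} \Big[ y(i) - \sum_{j=0}^{M} \alpha_j(t)\, u^j(i) \Big]^2. \tag{11}$$

Taking the derivative of equation (11) with respect to each parameter and setting it to zero,

$$\frac{\partial J(t)}{\partial \alpha_k(t)} = -2 \sum_{i=t-N+1}^{t} \Big[ y(i) - \sum_{j=0}^{M} \alpha_j(t)\, u^j(i) \Big] u^k(i) = 0, \quad k = 0, 1, \ldots, M. \tag{12}$$

Rewriting the above equation set, we get

$$\sum_{j=0}^{M} \alpha_j(t) \sum_{i=t-N+1}^{t} u^{j+k}(i) = \sum_{i=t-N+1}^{t} y(i)\, u^k(i), \quad k = 0, 1, \ldots, M. \tag{13}$$

Defining $S_k(t) = \sum_{i=t-N+1}^{t} u^k(i)$ and $T_k(t) = \sum_{i=t-N+1}^{t} y(i)\, u^k(i)$, equation (13) can be rewritten in the following matrix equation form:

$$\begin{bmatrix} S_0(t) & S_1(t) & \cdots & S_M(t) \\ S_1(t) & S_2(t) & \cdots & S_{M+1}(t) \\ \vdots & \vdots & \ddots & \vdots \\ S_M(t) & S_{M+1}(t) & \cdots & S_{2M}(t) \end{bmatrix} \begin{bmatrix} \alpha_0(t) \\ \alpha_1(t) \\ \vdots \\ \alpha_M(t) \end{bmatrix} = \begin{bmatrix} T_0(t) \\ T_1(t) \\ \vdots \\ T_M(t) \end{bmatrix}. \tag{14}$$

Writing (14) compactly as $A(t)\,\alpha(t) = b(t)$, the solution at time $t$ can be obtained, provided $A(t)$ is nonsingular, by solving the equation set:

$$\alpha(t) = A^{-1}(t)\, b(t). \tag{15}$$

By this means, the U-model with sliding window can be obtained:

$$\hat{y}(t) = \sum_{j=0}^{M} \alpha_j(t)\, u^j(t). \tag{16}$$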

Furthermore, given a future latitude, the corresponding longitude can be predicted by the identified model. However, the prediction error increases as the future latitude moves away from the current latitude. From the trace analysis of food safety criminals, three typical trajectories are extracted. Based on the idea of multiple model adaptive control, curve prediction is further conducted according to the accumulated model error of the different typical trajectories.
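The sliding-window least-squares identification described in this subsection can be sketched in pure Python as follows. The sketch assumes a polynomial model of the output in the current input, fitted over the most recent `window` samples by solving the normal equations; the function names, the window length, the degree, and the synthetic data are all illustrative assumptions rather than the settings used in the paper.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def fit_window(u, y, degree):
    """Least-squares coefficients alpha_0..alpha_degree of y ~ sum_j alpha_j u^j."""
    A = [[sum(ui ** (j + k) for ui in u) for j in range(degree + 1)]
         for k in range(degree + 1)]
    b = [sum(yi * ui ** k for ui, yi in zip(u, y)) for k in range(degree + 1)]
    return solve_linear(A, b)


def sliding_u_model(u, y, window, degree):
    """Return one coefficient vector for each time step t >= window - 1."""
    return [fit_window(u[t - window + 1:t + 1], y[t - window + 1:t + 1], degree)
            for t in range(window - 1, len(u))]


def predict(alpha, u_next):
    """Evaluate the identified polynomial at a new input value."""
    return sum(a * u_next ** j for j, a in enumerate(alpha))


# Synthetic data (hypothetical): an exact quadratic relation between u and y.
u = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.0 + 2.0 * v + 0.5 * v * v for v in u]
alphas = sliding_u_model(u, y, window=4, degree=2)
y_hat = predict(alphas[-1], 6.0)
```

With raw GPS coordinates, whose values are large and vary only in the last few decimals, the normal equations become ill-conditioned, so subtracting the window mean from the inputs before fitting is a sensible precaution.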

##### 3.2. Multiple U-Model Algorithm

Given a set of trajectory values, automatically deciding which U-model should be used to predict the future trajectory is of great importance, because a mismatched model may cause a large prediction error. The basic idea is to use the framework of multiple model adaptive control to decide the best-matching model.

For each clustered trajectory, the corresponding U-model can be constructed by equation (16). However, as can be seen in Figure 5, trajectories may not have latitude values at every longitude point. The model set is therefore completed by setting the latitude value to zero wherever it is null.

At every moment, an index function is calculated for each trajectory model to measure the matching degree between the existing trajectory states and that model. The index accumulates the model errors at the previous positions, weighted by a forgetting factor.

Given a series of trajectory states, the model with the smallest index function is selected to predict the trajectory.
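One plausible form of this selection rule, given here only as a hedged sketch, is a discounted sum of squared errors $J_k(t) = \sum_{i=1}^{t} \rho^{\,t-i} e_k^2(i)$ with a forgetting factor $0 < \rho \le 1$, followed by choosing the model with the smallest $J_k(t)$. The exact index function of the paper is not reproduced here; the squared-error form, the value of `forgetting`, and the three toy models are assumptions.

```python
def accumulated_error(observed, predicted, forgetting):
    """Discounted sum of squared errors; the most recent samples weigh most."""
    J = 0.0
    for obs, pred in zip(observed, predicted):
        J = forgetting * J + (obs - pred) ** 2
    return J


def select_model(inputs, observed, models, forgetting=0.9):
    """Return the index of the best-matching model and all index values."""
    costs = []
    for model in models:
        predicted = [model(u) for u in inputs]
        costs.append(accumulated_error(observed, predicted, forgetting))
    best = min(range(len(models)), key=lambda k: costs[k])
    return best, costs


# Three hypothetical trajectory models standing in for the clustered U-models.
models = [lambda u: 2.0 * u, lambda u: u * u, lambda u: 5.0 - u]

# A few noisy observations that roughly follow the second model.
inputs = [1.0, 2.0, 3.0]
observed = [1.1, 3.9, 9.2]
best, costs = select_model(inputs, observed, models)
```

A forgetting factor below 1 lets the selection react when a suspect switches from one typical route to another, at the cost of being more sensitive to noise in the latest GPS points.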

#### 4. Simulation

According to the clustering result as shown in Figure 5, three typical activity trajectories are extracted. Using the U-model for the trajectory tracking algorithm as in equation (16), modelling results of the three typical trajectories are shown in Figures 6–8.

As can be seen in the modelling result figures, the proposed U-model tracks the GPS activity trajectories well: the RMSE (root mean square error) of each of the three models, measured in degrees, corresponds to less than 4.6 meters in real distance. For gently changing curves, the proposed method models the trajectory with relatively small errors. At corner points, such as when the longitude is around 116.321 in Figure 6 and the latitude drops dramatically from 39.988 to 39.985, the modelling error increases because such a sharply changing curve is hard to model exactly with a polynomial function.
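To relate an error in degrees to a real distance, a rough spherical-Earth conversion can be used. The sketch below assumes about 111.32 km per degree of latitude and scales longitude by the cosine of the latitude; the constant, the function name, and the choice of latitude 39.988 (taken from the corner point mentioned above) are assumptions for illustration, not the conversion used by the authors.

```python
import math

# Approximate length of one degree of latitude on a spherical Earth (assumption).
METERS_PER_DEG_LAT = 111_320.0


def meters_to_degrees(meters, latitude_deg):
    """Return the (latitude, longitude) spans in degrees that cover `meters`."""
    dlat = meters / METERS_PER_DEG_LAT
    dlon = meters / (METERS_PER_DEG_LAT * math.cos(math.radians(latitude_deg)))
    return dlat, dlon


# How many degrees does a 4.6 m error correspond to near latitude 39.988?
dlat, dlon = meters_to_degrees(4.6, 39.988)
```

Under these assumptions, a 4.6 m error is on the order of a few hundred-thousandths of a degree, which gives a sense of the scale of the RMSE values.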

Further, given a set of randomly chosen GPS points, depicted as star points in Figure 9, the trajectories predicted by the multiple U-model algorithm are shown as red lines. Even though the given GPS points may not be precise, the multiple U-model algorithm can effectively identify the best-matched model by calculating the accumulated errors between the given states and the clustered models, and then predict the future trajectory.

**Figure 9.** Trajectories predicted by the multiple U-model algorithm from given GPS points, panels (a)–(c).

#### 5. Conclusion and Future Work

In order to prevent and fight food safety-related crimes, this paper proposed a prediction method of criminal activity trajectories. Based on social text data, emotional assessment is conducted using the LSTM network to detect food safety criminal suspects. Activity trajectories of criminal suspects are clustered using the graphic clustering method based on the GPS data. The U-model with the sliding window algorithm is proposed to model activity trajectories. Further, the multiple U-model strategy is proposed to predict the activity trajectory based on the accumulated model error of previous positions and multiple clustered trajectories. A simulation study shows that the proposed scheme can detect food safety criminal suspects and predict their activity trajectories.

However, the trajectory prediction method proposed in this paper is restricted to trajectories with increasing longitude, and how to deal with loopback trajectories using the U-model is still an open problem. In the future, we will further explore the description of trajectories using the U-model and refine this method to suit more general trajectories.

#### Data Availability

The MATLAB codes used to support the findings of this study are available from the corresponding author upon request.

#### Conflicts of Interest

The authors declare that they have no conflicts of interest.

#### Acknowledgments

This research was supported by the National Natural Science Foundation of China under Grant nos. 61873006 and 61673053, the National Key Research and Development Project under Grant nos. 2018YFC1602704 and 2018YFB1702704, and the Beijing Municipal Natural Science Foundation under Grant no. 4204087.