Abstract
This paper proposes an online trajectory simplification algorithm based on interval floating. The algorithm uses the accumulated angle deviation, and a bounded error theorem for interval floating is presented. The accumulated angle deviation is counted from the most recently reserved point, and the sum of the angle deviations generated by the subsequent trajectory points is updated continuously. When the simplification threshold is reached for the first time, the algorithm judges whether the threshold interval needs to be floated, and the next reserved point is then sought within the floated error interval. It is worth noting that the interval between two adjacent reserved points floats only once. The algorithm is tested on real trajectory data, and the experimental results show that it achieves an improved simplification rate within a given simplification error.
1. Introduction
In the 21st century, with the advent of 4G and 5G mobile communication technologies, the growing popularity of the Internet has considerably promoted social progress, and big-data-related research has become a global hot spot. Mobile computing has also been evolving constantly along with technologies such as mobile object databases and mobile communication network coverage. The location of mobile terminal equipment keeps changing with time, and GPS and RFID are currently the main means of data collection. Researchers use the collected trajectory data to study the characteristics of moving objects and apply them in areas such as smart transportation and location-based services (LBS) [1, 2]. Location-based personalized service technology has been thriving in recent years; in terms of personalized recommendation, there are location recommendation [3], travel prediction [4], and user behavior analysis [5]. The basis of all LBS research is positioning, so a large amount of user location data needs to be obtained. However, much of the stored massive trajectory data is of little research significance, which increases the difficulty of subsequent analysis and wastes considerable capital and effort on storage. In order to reduce the storage cost, more attention has been drawn to the trajectory simplification of moving objects. Many excellent trajectory simplification algorithms have been reported, but the development of trajectory simplification algorithms is still relatively slow, and each simplification algorithm has its own limitations.
Many algorithms have their own specific application scenarios, and it is necessary to develop new simplification algorithms to broaden the spectrum of trajectory simplification so as to cope with various types of trajectory data sets in the future. In addition, most online algorithms use buffers, since interval algorithms often have high time complexity. Therefore, it is necessary to explore new low-cost trajectory simplification algorithms.
So far, some related research has focused on simplification algorithms for moving object trajectories. Trajectory data simplification was first applied in computer graphics, where the processed data contained only spatial information; therefore, the perpendicular Euclidean distance (PED) was widely used. However, the data collected by a GPS system record time information in addition to spatial information. If PED continues to be used for simplifying moving object trajectories, the time information is inevitably lost. Therefore, the synchronized Euclidean distance (SED) has gradually been adopted in trajectory simplification algorithms. Existing studies on trajectory simplification fall mainly into two categories: offline simplification and online simplification.
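The difference between the two distance measures can be illustrated with a short Python sketch. It assumes planar points given as (x, y, t) tuples; the function names `ped` and `sed` are illustrative and do not come from the cited works.

```python
import math

def ped(p, a, b):
    """Perpendicular distance from p to the line through a and b (time ignored)."""
    (px, py), (ax, ay), (bx, by) = p[:2], a[:2], b[:2]
    dx, dy = bx - ax, by - ay
    seg_len = math.hypot(dx, dy)
    if seg_len == 0.0:
        # Degenerate segment: fall back to the point-to-point distance.
        return math.hypot(px - ax, py - ay)
    return abs(dx * (py - ay) - dy * (px - ax)) / seg_len

def sed(p, a, b):
    """Distance from p to the position interpolated on a->b at p's timestamp."""
    (px, py, pt), (ax, ay, at), (bx, by, bt) = p, a, b
    if bt == at:
        return math.hypot(px - ax, py - ay)
    r = (pt - at) / (bt - at)              # fraction of elapsed time
    sx, sy = ax + r * (bx - ax), ay + r * (by - ay)
    return math.hypot(px - sx, py - sy)
```

For a point that lies on the segment but is reached at the "wrong" time, for example p = (2, 0, 8) between a = (0, 0, 0) and b = (10, 0, 10), the PED is zero while the SED is not, which is why PED alone discards temporal information.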
The advantage of online trajectory simplification is that it supports real-time applications and can compress trajectory data as new trajectory points are acquired. Offline trajectory simplification starts compressing only after all points of the input trajectory have been acquired and is suitable for analysing historical trajectory data. Offline trajectory simplification usually produces smaller errors than online trajectory simplification. However, in many applications, such as real-time trajectory tracking and position monitoring, the trajectory data of the moving objects arrive as a stream, for example, the real-time AIS information received by shore-based systems. Therefore, some online trajectory simplification methods have been proposed to handle this situation.
The earliest offline trajectory simplification algorithm based on line segments is the Douglas–Peucker algorithm (DP algorithm) proposed in 1973 [6]. The DP algorithm requires a simplification threshold, measured in PED, to be given in advance. It recursively selects the point whose distance exceeds the threshold and splits the trajectory there, and it keeps running until the distances of all remaining points are less than the threshold. The TDTR algorithm (also known as the top-down time ratio algorithm) was proposed by Meratnia and de By [7]. TDTR overcomes a shortcoming of the traditional DP algorithm, namely the loss of time information: it replaces PED with SED in the distance function of the DP algorithm, because SED retains time information as well as position. Lin et al. [8] proposed the ATS algorithm, which segments the original trajectory according to the trajectory speed, calculates the SED threshold of each small trajectory, and finally uses the DP algorithm to produce the simplified trajectory. Ke et al. [9] proposed the Angular algorithm, which uses the accumulated angle deviation to decide whether to retain or discard a trajectory point. By setting the accumulated angle threshold, the angle error before and after the trajectory simplification can be controlled, and the time complexity is O(n). Opening Window (OW) is a traditional online trajectory simplification algorithm. The core idea of OW is to initialize a window with a fixed size from the starting point of the trajectory and slide the window over the points of the original trajectory, repeating the process until the last point of the original trajectory has been processed [10]. Opening Window Time Ratio (OWTR) [7] extends OW by using the synchronized Euclidean distance (SED) error instead of the spatial error.
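As a rough illustration of the top-down idea behind the DP algorithm, the following Python sketch recursively splits a polyline at the point of largest perpendicular distance. It is a textbook-style sketch under the assumption of planar (x, y) points; the names `douglas_peucker` and `_perp_dist` are illustrative only.

```python
import math

def _perp_dist(p, a, b):
    """Perpendicular distance from p to the line through a and b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    seg_len = math.hypot(dx, dy)
    if seg_len == 0.0:
        return math.hypot(p[0] - a[0], p[1] - a[1])
    return abs(dx * (p[1] - a[1]) - dy * (p[0] - a[0])) / seg_len

def douglas_peucker(points, threshold):
    """Return the simplified polyline; the two endpoints are always kept."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    split, d_max = 0, 0.0
    for i in range(1, len(points) - 1):
        d = _perp_dist(points[i], first, last)
        if d > d_max:
            split, d_max = i, d
    if d_max <= threshold:
        # Every interior point is close enough to the chord: drop them all.
        return [first, last]
    left = douglas_peucker(points[:split + 1], threshold)
    right = douglas_peucker(points[split:], threshold)
    return left[:-1] + right               # avoid duplicating the split point
```

Because the whole trajectory must be available before the first split can be chosen, this scheme is inherently offline. Replacing `_perp_dist` with a time-synchronized distance gives a TDTR-style variant.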
A number of successful online trajectory simplification algorithms have been proposed for road vehicle trajectories, terrain boundary lines, trajectory data mining, and graphics display [11, 12]. Trajcevski et al. [13] proposed an online algorithm called Dead Reckoning, which uses estimation to predict subsequent trajectory points. Muckell et al. [14] proposed the SQUISH algorithm, which adds new points directly to a buffer while there is still space in it. When the buffer is full, the algorithm deletes the point with the smallest error, and the removal of such a low-information point increases the importance of its left and right neighbors. The trajectory simplified by the algorithm has a reliable error guarantee, and the algorithm can be flexibly adjusted through the simplification rate and the error. SQUISH-E [14] can achieve the smallest error under an artificially given simplification rate. The worst-case time complexity of the algorithm is O(n log n/β), where β represents the artificially set simplification rate. Although it is nominally an online trajectory simplification algorithm, under an artificially given simplification rate the SQUISH algorithm can in fact only perform offline trajectory simplification: the algorithm needs to repeatedly traverse all points in the original trajectory, which is time-consuming. In the most extreme case, the error between the original trajectory and the simplified trajectory can be considerable.
The main innovations and contributions of this paper are as follows: (1) The bounded error theorem of interval floating is proposed, which allows the trajectory to be fully simplified within a given simplification error. (2) An online trajectory simplification algorithm is presented and implemented, which can simplify trajectory data online. (3) Various experiments were conducted with different data set sizes and different angle thresholds to evaluate the time performance and simplification rate performance of the algorithm. The experimental comparison shows that, in the face of large-scale trajectory data, the proposed algorithm has better time complexity and simplification rate.
2. Related Definitions and Lemmas of Algorithms
2.1. Definition 1: Angle Difference of Trajectory Segment [9]
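Define the direction angle of the trajectory segment as follows: the directions of the trajectory segments $\overrightarrow{p_{i-1}p_i}$ and $\overrightarrow{p_i p_{i+1}}$ are denoted by $\theta_{i-1}$ and $\theta_i$, respectively, and the constraints are
$$0 \le \theta_{i-1} < 2\pi, \qquad 0 \le \theta_i < 2\pi.$$
The angle difference formula is
$$\Delta(\theta_1, \theta_2) = \min\{|\theta_1 - \theta_2|,\; 2\pi - |\theta_1 - \theta_2|\}.$$
For two given angles $\theta_1$ and $\theta_2$, the magnitude of their angle difference is
$$\Delta(\theta_1, \theta_2) = \begin{cases} |\theta_1 - \theta_2|, & |\theta_1 - \theta_2| \le \pi, \\ 2\pi - |\theta_1 - \theta_2|, & |\theta_1 - \theta_2| > \pi. \end{cases}$$
The angle difference is divided into two cases, $|\theta_1 - \theta_2| \le \pi$ and $|\theta_1 - \theta_2| > \pi$ (see Figure 1). It follows directly from the definition that the range of the angle difference is $[0, \pi]$.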
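As a minimal sketch of Definition 1, the Python functions below compute the direction of a segment and the wrapped difference between two directions. The function names and the representation of points as planar (x, y) tuples are illustrative assumptions.

```python
import math

def direction(p, q):
    """Direction of the segment p->q in radians, wrapped to start at 0."""
    return math.atan2(q[1] - p[1], q[0] - p[0]) % (2 * math.pi)

def angle_difference(theta1, theta2):
    """Smallest rotation between two directions; the result lies in [0, pi]."""
    d = abs(theta1 - theta2) % (2 * math.pi)
    return d if d <= math.pi else 2 * math.pi - d
```

The two branches of the return statement correspond to the two cases of the definition: when the raw difference exceeds half a turn, the shorter way around the circle is taken instead.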
2.2. Definition 2: Angular Deviation of Trajectory Segment [9]
The angle of the moving object at point and the angle change between the points before and after it is the angle deviation of the point, which is represented by the symbol , and the details are as follows:
has a positive value and a negative value (see Figure 2), and represent the angle deviation of point and point , respectively.
2.3. Definition 3: Cumulative Angular Deviation [9]
The meaning of the cumulative angle deviation is the cumulative sum of the angle deflection of all points from the count point to the current point, which is defined as follows:
It should be noted that represents the starting point of the simplified trajectory segment , is the end point of the trajectory segment, and the subscript of the trajectory segment needs to satisfy the constraint: .
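Definitions 2 and 3 can be sketched in Python as follows. The sketch assumes planar (x, y) points and treats counter-clockwise turns as positive; this sign convention and the function names are assumptions made for illustration.

```python
import math

def angular_deviation(prev, cur, nxt):
    """Signed change of direction at `cur`, wrapped into (-pi, pi]."""
    t_in = math.atan2(cur[1] - prev[1], cur[0] - prev[0])
    t_out = math.atan2(nxt[1] - cur[1], nxt[0] - cur[0])
    d = t_out - t_in
    while d > math.pi:
        d -= 2 * math.pi
    while d <= -math.pi:
        d += 2 * math.pi
    return d

def cumulative_deviations(points, start):
    """Running sum of signed deviations for the points after index `start`.

    Returns (index, cumulative deviation) pairs; the last point has no
    outgoing segment and therefore no deviation.
    """
    total, out = 0.0, []
    for i in range(start + 1, len(points) - 1):
        total += angular_deviation(points[i - 1], points[i], points[i + 1])
        out.append((i, total))
    return out
```

Because the deviations are signed, a left turn followed by an equal right turn cancels out, so the cumulative value at a point measures how far the current heading has drifted from the heading of the first segment after the starting point, provided the drift stays below a half turn.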
2.4. Lemma 1: Bounded Error of Position Information [15]
A trajectory simplification algorithm that retains direction information generally achieves simplification by constraining the direction of the trajectory, which guarantees a bounded direction error. In fact, such an algorithm also retains position information while retaining direction information. Long et al. [15] proved that the direction-preserving algorithm can maintain the position characteristics. The following introduces the bounded error lemma of position information:
In the simplified algorithm that preserves direction information, if the simplified error is within , then the shortest vertical Euclidean distance and of the original trajectory and the simplified trajectory must satisfy the following relationship:
Among them, represents the maximum length of the original trajectory segment corresponding to the simplified trajectory segment in the simplified trajectory.
Proof. Let be a simplified trajectory segment selected arbitrarily. Among them, we stipulate that is the starting point of the simplified trajectory section, and is the end point of the simplified trajectory section. Then, the simplification of the current trajectory section must satisfy the directional error within . For a simplified trajectory segment selected at random, we can construct a rhombus according to (see Figure 3), and prove the theorem in the rhombus.
In the rhombus in the figure above, there is a relationship: . Assuming that the position of is outside the rhombus, then there must be a situation where is below . Therefore, there is a difference between the unsimplified trajectory segment and the simplified trajectory segment . There must be an intersection point between them. If the position of is outside the rhombus, the direction error between the original trajectory segment and the simplified trajectory segment must be greater than . So far, the conclusion drawn is in contradiction with our original definition of the direction angle error. So it is deduced that must exist within the rhombus constructed by the simplified trajectory segment . So, it is concluded that even the point farthest from the simplified trajectory segment must satisfy the following constraints:Among them, distance represents the vertical Euclidean distance between two elements in the calculation plane, which can be the distance from point to point or point to straight line.
2.5. Lemma 2: Bounded Error of Direction [9]
Ke et al. [9] proposed the bounded error theorem of direction in the A algorithm and proved the theorem. The accumulated angle deviation used by the A algorithm is based on the bounded error theorem of direction. The following will prove that the accumulated angular deviation is a bounded error in direction.
Assuming that the artificially prescribed direction threshold is , for each original trajectory point , if the cumulative angle deviation of point is greater than the given , then will be retained. The direction error between the final simplified trajectory and the original trajectory must obey the following constraints:
Proof. According to the relevant definition introduced above, for a random segment of trajectory segment , the direction of the trajectory segment composed of connected trajectory points between and can be derived. The specific expression is as follows:Among them, the is the starting point, is the end point, and , , ... etc. represent the trajectory points between and .
For the random trajectory segment , the trajectory segment composed of all the adjacent trajectory points in satisfies the angle constraint (see Figure 4). In other words, for all the trajectory segments , the direction of must be within the error in the direction of the first small trajectory segment .
The vector in the rhombus (see Figure 3) can be expressed as follows:According to (11), it can be seen that the vector sum of the geometric vectors formed by all two adjacent trajectory points in can be finally expressed as , then the following constraints must be derived:According to the above proofs, the direction angle error of any deleted track segment in the original trajectory is controlled within the range of from the original trajectory.
Therefore, when running the trajectory simplification algorithm, setting the algorithm's simplification threshold to half of the allowed direction error ensures that the error between the simplified trajectory and the original trajectory stays within that allowed direction error.
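The retention rule implied by Lemma 2 can be sketched as a single pass over the points. The code below is an illustrative reading of the accumulated-angle approach of [9], not its reference implementation: it assumes planar (x, y) points, takes the allowed direction error `eps` as input, and uses `eps / 2` as the working threshold. The names `angular_simplify` and `_turn` are hypothetical.

```python
import math

def _turn(prev, cur, nxt):
    """Signed change of direction at `cur`, wrapped into (-pi, pi]."""
    d = (math.atan2(nxt[1] - cur[1], nxt[0] - cur[0])
         - math.atan2(cur[1] - prev[1], cur[0] - prev[0]))
    while d > math.pi:
        d -= 2 * math.pi
    while d <= -math.pi:
        d += 2 * math.pi
    return d

def angular_simplify(points, eps):
    """Keep a point when the accumulated deviation since the last kept
    point exceeds eps / 2 in magnitude; endpoints are always kept."""
    if len(points) < 3:
        return list(points)
    kept = [points[0]]
    cum = 0.0
    for i in range(1, len(points) - 1):
        cum += _turn(points[i - 1], points[i], points[i + 1])
        if abs(cum) > eps / 2:
            kept.append(points[i])
            cum = 0.0                      # restart counting from the kept point
    kept.append(points[-1])
    return kept
```

Only the running sum is carried from one point to the next, so each point is processed in constant time, which is consistent with the O(n) complexity stated for the Angular algorithm.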
3. Trajectory Simplification Algorithm Based on Interval Floating
3.1. Theorem: Bounded Error of Interval Float
This paper proposes the theorem: Bounded Error of Interval Float.
If the current trajectory point is in the interval , the algorithm performs a floating operation, that is, the accumulated angle deviation interval that needs to be discarded to simplify the trajectory floats from to ; if the current trajectory point satisfies the accumulated angle deviation is within the interval , the accumulated angle deviation If the interval floats from to , the simplified error must satisfy the constraints:
Among them, and respectively represent the minimum and maximum accumulated angle deviation values from the last retained point to the current trajectory point.
Proof. For the simplified trajectory segment in Lemma 2, if the cumulative angle deviation of the end point is in the interval , there must be a small value (see Figure 5), so that the current point satisfies the following formula:and the minimum accumulated angle deviation in the trajectory section must satisfy the following formula:Therefore, point is not necessarily the first reserved point encountered from point , and the interval of the error threshold needs to be floated up: from to . After that, continue to search for the first reserved point encountered from point , and the point whose cumulative angle deviation is greater than for the first time is reserved; in the same way, when , the interval needs to be floated down to .
The direction of the trajectory segment existing between the first and last points of the simplified trajectory that floats through the interval satisfies the formula:For the floating interval, the random trajectory segment satisfies the angle constraint in its trajectory segment :That is, the directions of the small trajectory sections between the simplified trajectory sections are all within the direction error interval of the floating interval of the first small trajectory section of the simplified trajectory section .
After floating, the simplified trajectory segment of the new end point will be obtained, and the rhombus is reconstructed according to (see Figure 6). The vector B in the rhombus can be expressed as follows:According to the abovementioned formula, the vector sum of the geometric vectors composed of all the two adjacent trajectory points between can be finally expressed as , and the satisfied constraints must be derived:In the same way, the descending provable interval must satisfy the constraint:According to the above proofs, the error between a random small track segment deleted in the middle of the simplified track segment and the original track can be guaranteed to be or .
Therefore, when running the trajectory simplification algorithm, when the cumulative angle deviation of the current point is greater than the given threshold , refinding the first reserved point after an interval float must ensure the directional error between the simplified trajectory and the original trajectory in .
3.2. Description of Interval Floating Algorithm
According to the related lemmas and theorems, this paper proposes an interval floating-based trajectory simplification algorithm for moving objects. Once the simplification threshold is set, for each trajectory point collected, the search starts from the most recently retained point each time; if there is a trajectory point that satisfies , the point will be reserved, which avoids the situation in which the angle deviation of the reserved point is extremely small when its accumulated angle deviation reaches the threshold. For example, if we set the direction error to 0.6, so that the simplification threshold when the algorithm is running is 0.3, point will be retained instead of point , but the contribution of point is much greater than that of , so we cannot use as the judgment condition (see Figure 7). However, algorithm uses as the judgment condition; if no such point exists, it continues to find the point where the accumulated angle deviation is greater than the simplification threshold, performs an interval float from that point, and continues to find the first point after the floating operation. If the accumulated angle deviation of a point exceeds the floated interval, this point must be retained. At this time, the direction interval is reset to the initial threshold interval; that is, each simplified trajectory segment passes through the floating interval no more than once. If the interval floated more than once, the bounded error between the simplified trajectory and the original trajectory could not be guaranteed (see Algorithm 1).
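The control flow described above can be illustrated with the following Python sketch. It is not a transcription of Algorithm 1. It assumes planar (x, y) points and one particular floating rule: on the first exit from the initial interval [-eps/2, eps/2], the interval is shifted once to a width-eps window anchored at the smallest (or largest) accumulated deviation seen since the last retained point. The additional per-point judgment condition mentioned in the text is omitted. The names `interval_float_simplify` and `_turn` are hypothetical.

```python
import math

def _turn(prev, cur, nxt):
    """Signed change of direction at `cur`, wrapped into (-pi, pi]."""
    d = (math.atan2(nxt[1] - cur[1], nxt[0] - cur[0])
         - math.atan2(cur[1] - prev[1], cur[0] - prev[0]))
    while d > math.pi:
        d -= 2 * math.pi
    while d <= -math.pi:
        d += 2 * math.pi
    return d

def interval_float_simplify(points, eps):
    """One-pass simplification with at most one interval float per segment."""
    if len(points) < 3:
        return list(points)
    half = eps / 2
    kept = [points[0]]
    cum = lo_seen = hi_seen = 0.0          # accumulated deviation and its extremes
    lo, hi, floated = -half, half, False   # current interval and float flag
    for i in range(1, len(points) - 1):
        cum += _turn(points[i - 1], points[i], points[i + 1])
        if not (lo <= cum <= hi) and not floated:
            # First exit: float once if a width-eps window still covers cum.
            if cum > hi and cum <= lo_seen + eps:
                lo, hi, floated = lo_seen, lo_seen + eps, True
            elif cum < lo and cum >= hi_seen - eps:
                lo, hi, floated = hi_seen - eps, hi_seen, True
        if lo <= cum <= hi:
            lo_seen, hi_seen = min(lo_seen, cum), max(hi_seen, cum)
            continue
        # Outside the (possibly floated) interval: retain the point and reset.
        kept.append(points[i])
        cum = lo_seen = hi_seen = 0.0
        lo, hi, floated = -half, half, False
    kept.append(points[-1])
    return kept
```

Under this assumed rule, all accumulated deviations between two retained points stay inside a single window of width `eps`, and the window is restored to its initial position after every retained point, which mirrors the "float only once" restriction. On a path that turns steadily in one direction, the sketch retains fewer points than a fixed `eps / 2` threshold would.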

4. Experiment and Discussion
In order to verify the trajectory simplification algorithm based on interval floating, this paper uses the Geolife dataset [16–18]. This GPS trajectory dataset was collected in the Geolife project (Microsoft Research Asia) by 182 users over a period of more than five years (from April 2007 to August 2012). A GPS trajectory in this dataset is represented by a sequence of timestamped points, each of which contains latitude, longitude, and altitude information. Each file selected for this experiment is larger than 30 KB. The experiments use data sets of different sizes and different angle thresholds, and the time performance and simplification rate performance of the algorithm are analyzed from the experimental results.
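For readers who wish to reproduce the setup, the following Python sketch parses the text of one Geolife trajectory file. It assumes the PLT layout described in the dataset's user guide (six header lines, then comma-separated records with latitude, longitude, an unused field, altitude in feet, a fractional day number, a date string, and a time string); the function name `parse_plt` is hypothetical.

```python
from datetime import datetime

def parse_plt(text):
    """Parse the contents of a Geolife .plt file into a list of
    (latitude, longitude, altitude_ft, timestamp) tuples."""
    points = []
    for line in text.splitlines()[6:]:     # the first six lines are header
        fields = line.strip().split(",")
        if len(fields) < 7:
            continue                       # skip blank or malformed lines
        lat, lon = float(fields[0]), float(fields[1])
        alt_ft = float(fields[3])
        ts = datetime.strptime(fields[5] + " " + fields[6],
                               "%Y-%m-%d %H:%M:%S")
        points.append((lat, lon, alt_ft, ts))
    return points
```

The parsed latitude and longitude would still need to be projected to planar coordinates before segment directions are computed; that step is left out of the sketch.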
The simplification rate of the algorithm is defined as follows:
$$R = \frac{|T'|}{|T|}.$$
Among them, $T'$ represents the simplified trajectory, $T$ represents the original trajectory, and $|\cdot|$ denotes the number of trajectory points. What we hope is that the algorithm can guarantee a low simplification rate within a certain error.
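Reading the simplification rate as the number of retained points divided by the number of original points, which is an interpretation consistent with lower values being better, the measure can be computed as in this small Python sketch. The function name is illustrative.

```python
def simplification_rate(simplified, original):
    """Fraction of the original points that remain after simplification."""
    if not original:
        raise ValueError("original trajectory must contain at least one point")
    return len(simplified) / len(original)
```

A value of 0.35, for instance, means that 35% of the original points are kept.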
4.1. Performance Evaluation Based on Simplified Time
According to the experimental results (see Figure 8), when the data size is constant and the threshold is set very small, the simplification time of algorithm is slightly larger than that of . As the threshold increases, the simplification time of algorithm tends to decrease and gradually approach . From the perspective of the two algorithms, whether it is or , as the simplification threshold increases, the simplification time of the algorithm tends to decrease.
The size of the data set ranges from 1000 to 5000. According to the experimental results, for the two algorithms, the average simplification time (see Figures 9 and 10) of the simplification thresholds of different sizes decreases with the increase of the data set.
4.2. Performance Evaluation Based on Simplification Rate
According to the experimental results (see Figure 11), it can be concluded that under 5000 trajectories, the average simplification rate of the two algorithms tends to decrease with the increase of the simplification threshold, and the simplification rate of algorithm is lower than .
By observing the experimental results (see Figures 12 and 13), we can conclude that no matter what the simplification threshold is, the relationship between the simplification rate of the two algorithms and the data scale is that the larger the data scale is, the lower is the simplification rate.
It can be concluded from the experimental results (see Figures 14 and 15) that for the two algorithms, the simplification rate decreases with the increase of the simplification threshold among the five data sets of different sizes. Also, compared to algorithm , the highest simplification rate of algorithm occurs only at a data size of 1000: when the simplification threshold is set to , it is slightly higher than 0.35, but for algorithm , when the simplification threshold is , the highest simplification rate is greater than 0.45, and the lowest simplification rate is also greater than 0.35. For massive trajectory data, the number of acquisition points for each trajectory exceeds tens of thousands. In this case, the algorithm retains too many trajectory points, which is not conducive to efficient data storage. By contrast, the algorithm can ensure the accurate retention of points while reducing the simplification rate.
5. Conclusions
This article introduces a new trajectory simplification algorithm based on interval floating. Various experiments were conducted with different data set sizes and different angle thresholds to evaluate the time performance and simplification rate performance of the algorithm. Through experimental comparison, algorithm has outperformed algorithm in simplification rate; in addition, as the data set increases, the average simplification time of algorithm is slightly longer than that of algorithm when the simplification threshold is smaller. With the increase of the threshold, the simplification time of the algorithm is significantly reduced. Therefore, in the face of large-scale trajectory data, the algorithm can show a better simplification rate and simplification time. For future work, it is planned to assess the algorithms with other datasets, considering other transportation modes and trajectory characteristics as well as different application scenarios.
Data Availability
The dataset used to support the results of this experiment is the Geolife GPS trajectory dataset collected by Microsoft Research Asia. Over more than five years (from April 2007 to August 2012), 182 users participated in the Geolife project. The data used in the algorithm experiment can be obtained through the following website: https://research.microsoft.com/en-us/projects/geolife/.
Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.