Abstract

Fishing vessel monitoring systems (VMSs) play an important role in ensuring the safety of fishing vessel operations. Traditional VMSs use a centralized cloud computing model, in which the storage, processing, and visualization of all fishing vessel data are completed in the monitoring center. Because of the limitations of maritime communications, the data generated by fishing vessels cannot be fully utilized, and communication delays lead to inadequate warnings in cases of fishing vessel abnormalities. In this paper, we present a real-time anomaly detection model (RADM) for fishing vessels based on edge computing. The model runs in the edge layer, makes full use of the information of moving edge nodes and nearby nodes, and combines a historical trajectory extraction detection model with an online anomaly detection model to detect anomalies. The historical trajectory extraction detection model mines frequent patterns in historical trajectories through multifeature clustering and identifies trajectories that differ from the frequent patterns as anomalies. The online anomaly detection algorithm detects anomalous behavior in specific scenarios based on spatiotemporal neighborhood similarity and reduces the impact of anomaly evolution. Experiments show that RADM is more effective than traditional methods in real-time anomaly detection of fishing vessels, which provides a new method for upgrading traditional VMS technology.

1. Introduction

At present, fishing vessel monitoring systems (VMSs) are widely used in fishing vessel safety management. In a VMS, the position and status information of each fishing vessel is collected and recorded by shipborne sensors at a fixed time interval and then sent back to a monitoring center, so that the resulting series of spatiotemporal data points forms a trajectory data set. The trajectory data set is the core data of the VMS and has very important application prospects. For example, in the process of sailing, fishing vessels may face unpredictable and/or abnormal conditions. In cases of equipment failure (radar, positioning equipment, etc.), bad weather (typhoon, etc.), or even terrorist events (such as hijacking by pirates), the monitoring center can promptly identify the abnormal conditions using the trajectory data and then extract the information necessary to take measures to maintain and guarantee the safety of the fishing vessel.

In recent years, many new technologies have been applied in VMSs, including big data trajectory computing and visualization [1–3], VMS data mining [4–6], machine learning [7–9], and maritime IoT [10–13]. However, anomaly detection for marine fishing vessels that relies on a monitoring center and transmitted trajectory data has the following shortcomings:
(1) Traditional VMSs use a centralized cloud computing model, and the storage, processing, and visualization of all fishing vessel data are completed in the monitoring center. The interaction between the VMS monitoring center and the fishing vessels is carried out through marine communications, but the bandwidth of marine communications is far behind that of land communication, which affects the real-time performance and accuracy of anomaly detection.
(2) Traditional vessel anomaly detection methods extract frequent patterns in trajectories through spatial location, sequence features, and abnormal behavior, and they detect abnormalities based on these frequent patterns. They require a large amount of data from all vessels. When the historical trajectory data set is small or the trajectory data in some areas are sparse, it is difficult to meet the requirements of detection accuracy, so these methods are not suitable for edge computing models.

To solve the above-stated problems, in this paper we propose a real-time anomaly detection model based on an edge computing framework. First, fishing vessel anomalies are detected according to the fishing vessel’s behavior characteristics. Anomalies are mainly divided into spatial position anomalies and behavioral pattern anomalies. Second, the frequent patterns of fishing vessels are extracted through multifeature clustering, and a trajectory that deviates from a frequent cluster is recognized as an anomaly. When the time-varying evolution of trajectory flow data is ignored, a global feature model constructed from a sufficient number of historical trajectories has high anomaly detection accuracy. However, when the normal trajectory patterns learned from the historical training set are the only reference standard, new abnormal patterns and scene-specific abnormal behaviors that may exist in the latest trajectory flow data cannot be detected accurately. Therefore, in this paper we use the communication function between the edge nodes to obtain the trajectory data of other nodes in the adjacent sea area in real time, and then use spatiotemporal nearest neighbor similarity to detect anomalies online in the collected trajectory data set. Third, the weighted detection results are calculated to determine whether an anomaly is present. That is, a high-precision and low-delay anomaly detection model is established in the edge layer through the cooperation between edge nodes, combining an online anomaly detection algorithm with a historical trajectory extraction algorithm. Finally, the feasibility of the proposed model is verified experimentally.

Through this paper, we aim to make the following contributions to the state of the art:
(1) To design a novel real-time anomaly detection model (RADM) for fishing vessels based on an edge computing framework to improve the real-time performance and accuracy of traditional VMSs.
(2) To propose a fishing vessel trajectory anomaly detection algorithm based on multifeature clustering (VAD-MFC), which can not only combine the historical global trajectory feature model but also update the model incrementally at the edge layer.
(3) To present a vessel anomaly detection algorithm based on spatiotemporal neighbor similarity (VAD-SNS), which can not only identify existing patterns but also detect new patterns and anomalous behavior in a specific scene. This model is suitable for running at edge nodes.

Maritime anomaly detection has received much attention recently, based on the premise that if a trajectory does not appear frequently or in any cluster, then it may be abnormal. The factors that affect the trajectory anomaly are not only reflected in the abnormal location but also hidden in the sequence of movements [14, 15]. In this context, we consider the behavior of a fishing boat to be abnormal if it is different from that of most fishing boats in the same sea area.

Generally, there are three types of anomalies: spatial position anomalies, sequence anomalies, and behavior anomalies [15]. Spatial position anomalies are anomalies based on vessel population density; that is, a ship located in a low-density area is considered anomalous. An abnormal position may arise because the vessel has entered a forbidden area or is not moving in a specified area. A sequence anomaly is an exception based on a sequential pattern. A behavior anomaly means that the ship’s behavior pattern differs from the trajectories of its nearest neighbors with respect to some characteristic such as direction or speed. There are many reasons for abnormal behaviors; for example, to improve efficiency, some ships may not slow down after entering the port.

At present, there is much research on trajectory anomaly detection [16]. According to the detection mechanism used, methods can be divided into four categories as follows: classification-based, distance-based, historical trajectory similarity-based, and grid-based.

Trajectory anomaly detection methods based on classification are mainly divided into two stages: the training stage and the detection stage. Li et al. [17] proposed an anomaly detection algorithm based on the correlation between anomaly patterns and spatiotemporal attributes. The algorithm used k-means to cluster the motifs of subtrajectory-related attributes and construct a rule-based classifier. Trajectory anomaly detection methods based on classification can achieve high accuracy when provided with an accurate training set [18]. Besides, many abnormal behaviors are unknown and change with time, so studies about online anomalous trajectory detection have been proposed [19, 20]. However, it is expensive to obtain accurate labelled data for practical applications.

Distance-based anomaly detection methods aim to detect the deviation of a trajectory in the dataset through a distance model. Knorr et al. [21] first proposed algorithms to identify abnormal trajectories through global characteristics such as trajectory, velocity, direction, and distance. However, this method is used to detect anomalies of complete trajectories, so it is only suitable for detecting abnormal behaviors whose location and behavior characteristics are completely different from those of other trajectories. Lee et al. [22] proposed a two-stage anomaly detection algorithm based on a segmentation detection framework. Yu et al. [23] proposed a strategy based on spatiotemporal nearest neighbor similarity. Wang et al. [24] presented the difference-and-intersection set distance metric to evaluate the similarity between any two trajectories. These anomaly detection methods suffer from high complexity and poor detection accuracy. Distance-based anomaly detection methods usually only focus on outlier behaviors with respect to location but ignore outlier trajectories whose behavior characteristics deviate from the nearest neighbors in space and time. Therefore, these methods are difficult to use in real-time trajectory anomaly detection.

Anomaly detection methods based on historical trajectory similarity formulate a global feature model by extracting frequent patterns from large-scale historical trajectory data and then identify trajectories that deviate from the model as anomalies. Liu et al. [25] proposed an anomaly detection algorithm for identifying spatiotemporal outliers, which could discover the causal relationship between outliers in time. Rong et al. [26] proposed a data mining approach for the probabilistic characterization of maritime traffic and anomaly detection. Belhadi et al. [27] proposed a deep learning algorithm that learns the different features of historical data to determine groups of trajectory outliers. Lei [15] constructed an anomaly detection framework based on the spatial and behavioral characteristics of trajectory data. In this class of methods, the accuracy of anomaly detection is affected by the number of historical tracks. When the historical data set is small or track data in some areas are sparse, the detection accuracy is insufficient for practical applications.

Trajectory anomaly detection methods based on grid partition divide the specified area into grid cells of equal size to detect abnormal behavior and are widely used in the field of transportation. Pang et al. [28] proposed a pattern recognition method based on likelihood ratio test statistics, which can detect “persistent” and “new” outliers in trajectory data effectively. Uniform grids have been used to represent the trajectory space and detect abnormal behaviors of taxis in real-time [29, 30]. The detection accuracy of grid-based anomaly detection methods is poor for trajectories with nonnormal distribution. Besides, these algorithms are only suitable for anomaly detection using geospatial information.

However, the above methods are based on traditional central computing architectures, and they are not optimized for edge nodes, so they are difficult to apply directly in edge computing scenarios. Moreover, in the field of marine fishing vessel management, only a few researchers have studied edge computing models of VMSs and established edge computing frameworks [31, 32], but there is no research on trajectory anomaly detection on this basis. In this paper, we present a real-time anomaly detection model for fishing vessels based on edge computing, and our models are experimentally verified to be more effective than traditional methods.

3. Study Area

The scope of this study was the sea area between 120°E and 130°E longitude and between 25°N and 35°N latitude, located near Zhoushan, Zhejiang Province, China. The trajectory data set of the fishing VMS used in the experiment is shown in Figure 1. The system used BeiDou satellite navigation and communication technology to realize real-time monitoring of fishing boats’ status. Its functions include ship query, voyage statistics, trajectory review, alarm and rescue, and so on. From May 22, 2016, to November 8, 2018, 220 ships were tracked and recorded.

Since the anomaly state of a fishing vessel is closely related to the sequence of its trajectory, in this study we focus only on the detection of adjacent trajectory segments at a certain moment when an anomalous state of the fishing vessel trajectory was detected. Given a fishing vessel trajectory segment , denoted as , we define the spatial and behavioral anomalies of a trajectory.

A spatial anomaly considers the correlation between the adjacent trajectory segments. The distance between the current trajectory segment and other segments is calculated using a trajectory distance metric, and then the current trajectory segment’s proportion of the adjacent trajectory segments is calculated using the adjacent trajectory distance threshold. If the proportion is smaller than the anomaly threshold, then the current trajectory segment can be identified as a spatial anomaly.
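The spatial anomaly test described above can be illustrated with a minimal Python sketch. It assumes the distances from the current segment to the other segments have already been computed with some trajectory distance metric; the function names, the kilometre units, and both threshold values are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def neighbor_proportion(distances: Sequence[float], dist_threshold: float) -> float:
    """Fraction of the other segments that lie within dist_threshold of the current one."""
    if not distances:
        return 0.0
    close = sum(1 for d in distances if d <= dist_threshold)
    return close / len(distances)


def is_spatial_anomaly(distances: Sequence[float],
                       dist_threshold: float,
                       anomaly_threshold: float) -> bool:
    """Flag the current segment when too few segments count as adjacent to it."""
    return neighbor_proportion(distances, dist_threshold) < anomaly_threshold


# Hypothetical distances (km) from the current segment to five other segments.
distances_km = [0.8, 1.5, 12.0, 30.0, 45.0]
print(neighbor_proportion(distances_km, dist_threshold=2.0))  # 0.4
print(is_spatial_anomaly(distances_km, 2.0, 0.5))             # True
```

Here two of the five segments are adjacent, so the proportion 0.4 falls below the assumed anomaly threshold of 0.5.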

A behavioral anomaly of a fishing vessel is an anomaly in the vessel’s multidimensional trajectory features, such as instantaneous angular acceleration, average angular acceleration, instantaneous speed, average speed, and acceleration. Each component receives a trajectory anomaly score by calculating the proportion of the abnormal trajectory point to the total trajectory points of the trajectory segment. Then, the component anomaly scores are integrated to obtain the final behavior anomaly score. If the score exceeds a predefined anomaly threshold, then the current trajectory is identified as a behavioral anomaly.
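As a rough illustration of this scoring scheme, the sketch below computes one score per feature as the proportion of abnormal points and then integrates the scores. Two points are assumptions rather than part of the definition above: a point is treated as abnormal when its feature value lies outside a given normal range, and the component scores are integrated by a weighted average. All names, ranges, and weights are hypothetical.

```python
from typing import Dict, Sequence, Tuple


def component_score(values: Sequence[float], normal_range: Tuple[float, float]) -> float:
    """Proportion of trajectory points whose value lies outside normal_range (assumed rule)."""
    if not values:
        return 0.0
    low, high = normal_range
    abnormal = sum(1 for v in values if v < low or v > high)
    return abnormal / len(values)


def behavior_score(features: Dict[str, Sequence[float]],
                   normal_ranges: Dict[str, Tuple[float, float]],
                   weights: Dict[str, float]) -> float:
    """Weighted average of the per-feature anomaly scores (assumed integration rule)."""
    total_weight = sum(weights[name] for name in features)
    if total_weight == 0:
        return 0.0
    weighted = sum(weights[name] * component_score(values, normal_ranges[name])
                   for name, values in features.items())
    return weighted / total_weight


# Hypothetical segment with four points and two features.
features = {"speed": [3.0, 4.0, 15.0, 16.0], "acceleration": [0.1, 0.2, 0.1, 2.0]}
ranges = {"speed": (0.0, 12.0), "acceleration": (-1.0, 1.0)}
weights = {"speed": 1.0, "acceleration": 1.0}
score = behavior_score(features, ranges, weights)
print(score, score > 0.3)  # compare against a hypothetical anomaly threshold of 0.3
```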

The abovementioned trajectory anomaly definitions are mainly aimed at determining abnormalities in the behavior of maritime fishing vessels. The sailing behavior of fishing vessels at sea is very different from that of general freight and passenger vessels. The latter determine their departure and destination ports before leaving and sail along established trade routes. Although the sea imposes no physical restriction comparable to an urban road network, a normal vessel does not deviate far from these established routes, and when the crew finds that the vessel has deviated from the route, they correct the ship’s course accordingly. Because they follow a determined route network, freight and passenger vessels at sea therefore behave much like land vehicles. Anomaly detection for such vessels can rely on the route information, and both the anomaly definition and anomaly detection are relatively simple. Fishing vessels, by contrast, have no defined destination area when at sea, nor do they follow a specific route, so their navigation exhibits considerable randomness. The sailing behavior of fishing vessels is mainly divided into two types, namely, sailing and fishing. Therefore, in this paper we make special distinctions for behavioral anomalies, as defined above: both position and behavior are considered to obtain the degree of anomaly suspicion of the fishing vessel’s current state, while behavioral anomaly detection is further classified according to whether it pertains to fishing or sailing.

4. Methods

4.1. Framework of Real-Time Trajectory Anomaly Detection Using Edge Computing

The three-module RADM framework is shown in Figure 2 and consists of historical trajectory modeling, anomaly evaluation, and the edge computing application.

Historical trajectory modeling extracts the features that distinguish normal from abnormal behavior and then clusters historical trajectories to find frequent patterns, which are considered normal behavior. The anomaly evaluation module first identifies trajectories that differ from the frequent patterns as anomalies, while the real-time trajectory is clustered incrementally to optimize the global feature model. Meanwhile, trajectory data received from nearby fishing vessels are used to detect new anomaly patterns and anomalous behavior in a specific scene. Finally, the RADM uses a combinational algorithm to calculate a comprehensive anomaly index. The edge computing module receives the output of RADM and applies it to applications such as safety protection and navigation of the fishing vessel.
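The text does not spell out the combinational algorithm at this point. Purely as an illustration, the following Python sketch assumes a convex weighted sum of the two detector outputs, with both scores already scaled to [0, 1]; the function name, the weight, and the example values are hypothetical.

```python
def combined_anomaly_index(historical_index: float,
                           online_index: float,
                           weight_historical: float = 0.5) -> float:
    """Convex combination of the two detector scores (assumed form of the combination)."""
    if not 0.0 <= weight_historical <= 1.0:
        raise ValueError("weight_historical must lie in [0, 1]")
    return weight_historical * historical_index + (1.0 - weight_historical) * online_index


# A trajectory that looks normal against history but unusual among its current neighbors.
index = combined_anomaly_index(historical_index=0.2, online_index=0.9, weight_historical=0.4)
print(round(index, 2))  # 0.62
```

A larger weight on the online score would favor scene-specific anomalies, while a larger weight on the historical score would favor deviations from long-term patterns.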

4.2. Fishing Vessel Trajectory Anomaly Detection Based on Multifeature Clustering

First, a global feature model is formulated based on the historical trajectory data set. Supervised and unsupervised learning algorithms are two commonly used modeling methods. The trajectories of the fishing vessels studied in this paper include two parts: (1) unlabeled data with a small number of abnormal trajectories and many normal trajectories and (2) a small number of labelled abnormal trajectories. Thus, it is more suitable to use an unsupervised learning algorithm. This study is based on an edge computing framework, that is, a framework where real-time anomaly detection of fishing vessels is needed in the edge layer [33, 34]. With the continuous inflow of trajectory data, the same trajectory may belong to different frequent patterns at different times because the global feature model changes in real time. Therefore, it is necessary to update the model incrementally while identifying anomalies.

To solve the above problems, in this paper we propose a vessel anomaly detection algorithm based on multifeature clustering (VAD-MFC). This algorithm extracts the characteristic quantity of obvious differences between normal and abnormal behavior and clusters historical trajectories according to these characteristic quantities to find the trajectory characteristic of normal behavior. Then, the minimum distance between the trajectory collected by the edge node in the current period and the normal behavior pattern is calculated, and this is considered the abnormality value of the current trajectory. Finally, the real-time trajectory is clustered incrementally to find a new trajectory characteristic and optimize the global feature model. It should be noted that the VAD-MFC algorithm is based on two premises: (1) the number of normal trajectories is much larger than that of abnormal trajectories and (2) there are obvious differences between the normal and abnormal behaviors of fishing vessels.

4.2.1. Frequent Area Formulation

Traditional grid-based clustering methods divide the monitored sea area into grids of equal size and then calculate the trajectory point density of each grid [35, 36]. If it is greater than a predefined threshold, then the area where the grid is located is marked as a frequent area. However, the above methods determine grid density by calculating the number of trajectory points in the grid, and thus the results of frequent regions may be lost when the object is moving too fast or the sampling interval is too large. This is demonstrated in Figure 3 [37, 38]. To avoid this kind of missing data problem, in this paper grid density is defined as the number of trajectory segments passing through using the method proposed in a prior study [15]. Each trajectory segment represents the line segment formed by adjacent trajectory points .
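The difference between point-based and segment-based grid density can be illustrated with a short Python sketch. It assumes planar coordinates in the same unit as the cell size, and it approximates the cells a segment passes through by sampling points along it at half-cell spacing; this sampling shortcut, the function names, and the threshold are illustrative assumptions, not the method of [15].

```python
import math
from collections import Counter
from typing import Dict, List, Set, Tuple

Point = Tuple[float, float]
Cell = Tuple[int, int]


def cells_crossed(p: Point, q: Point, cell_size: float) -> Set[Cell]:
    """Approximate the grid cells touched by segment p-q by sampling along it."""
    length = math.hypot(q[0] - p[0], q[1] - p[1])
    steps = max(1, math.ceil(2 * length / cell_size))
    cells = set()
    for i in range(steps + 1):
        t = i / steps
        x = p[0] + t * (q[0] - p[0])
        y = p[1] + t * (q[1] - p[1])
        cells.add((math.floor(x / cell_size), math.floor(y / cell_size)))
    return cells


def segment_density(trajectories: List[List[Point]], cell_size: float) -> Dict[Cell, int]:
    """Count, for every cell, the number of trajectory segments passing through it."""
    density: Counter = Counter()
    for traj in trajectories:
        for p, q in zip(traj, traj[1:]):
            for cell in cells_crossed(p, q, cell_size):
                density[cell] += 1
    return density


def frequent_cells(density: Dict[Cell, int], min_segments: int) -> Set[Cell]:
    """Cells whose segment count reaches the (hypothetical) density threshold."""
    return {cell for cell, n in density.items() if n >= min_segments}


# A fast vessel sampled only twice still contributes to the cells between its two points.
fast = [(0.5, 0.5), (3.5, 0.5)]
print(sorted(cells_crossed(fast[0], fast[1], 1.0)))  # [(0, 0), (1, 0), (2, 0), (3, 0)]
```

With point counting, only the first and last cells of the fast vessel's segment would receive any density, which is the loss of frequent regions described above.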

4.2.2. Subtrajectory Division

The suspicion level of a fishing vessel’s trajectory deviation is measured through the difference between the trajectory and the historical routes. Therefore, before anomaly detection, historical trajectories need to be clustered to obtain the historical trajectory mode. However, since the trajectory obtained in this paper is a complete trajectory with multiple stages and no markers, it is necessary to divide it into subtrajectories before modeling and ensure that only one behavior is assigned to each subtrajectory. If the historical trajectory contains the navigation stage subtrajectories, the fishing stage subtrajectories , and the anchoring stage subtrajectories , then the results of the subtrajectory division can be expressed as follows:

In this study, the spatiotemporal distance model of trajectory points and the density-based spatial clustering of applications with noise (DBSCAN) clustering algorithm are used to classify the trajectory of fishing vessels. The basis for the classification is that the behavior of fishing vessels is continuous in time and localized in space. The algorithm first defines a spatiotemporal distance model based on the characteristics of fishing vessel behavior, and then DBSCAN clustering is applied on the trajectory points according to the model to form subtrajectory segments. For trajectory points and , the spatiotemporal distances are defined as follows:

(i) Temporal Distance. The distance of the trajectory points in time reflects their temporal correlation. Because fishing vessel behavior is continuous in time, the smaller the interval, the greater the probability of finding the same behavior. The distance definitions of and in time are given in the following equation:

(ii) Spatial Distance. The geographic distance of the trajectory points reflects their spatial correlation. Because the behavior of fishing vessels is spatially localized, that is, the trajectory falls in the same area, then the probability of the trajectory being the outcome of the same behavior is greater. The spatial distance definitions of and are given in the following equation:

(iii) Speed Distance. Because the speed of fishing boats differs during navigation, fishing, and anchoring, the larger the speed difference is, the smaller the probability of the corresponding behaviors being the same. The definition of the distance between and on speed is given in the following equation:

(iv) Direction Distance. Due to the continuity of behavior in time, the distance in the direction is defined as the number of times the direction changes in a given interval.

When dividing the trajectories of fishing vessels, the four distances defined above should be considered. The distance between trajectories and is defined as
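To make the four component distances concrete, the following Python sketch combines them into a single point-to-point distance. The combination rule is an assumption: each component is divided by a scale and the results are summed with weights. The `TrackPoint` fields, the planar coordinates, the 15-degree turn tolerance, and all scales and weights are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass
class TrackPoint:
    t: float        # timestamp in seconds
    x: float        # planar easting in metres (assumed projection)
    y: float        # planar northing in metres
    speed: float    # speed in knots
    heading: float  # course in degrees


def direction_changes(points: Sequence[TrackPoint], i: int, j: int,
                      tolerance_deg: float = 15.0) -> int:
    """Number of heading changes larger than tolerance_deg between points i and j."""
    lo, hi = sorted((i, j))
    count = 0
    for k in range(lo, hi):
        diff = abs(points[k + 1].heading - points[k].heading) % 360.0
        if min(diff, 360.0 - diff) > tolerance_deg:
            count += 1
    return count


def point_distance(points: Sequence[TrackPoint], i: int, j: int,
                   scales: Tuple[float, ...] = (600.0, 1000.0, 2.0, 1.0),
                   weights: Tuple[float, ...] = (0.25, 0.25, 0.25, 0.25)) -> float:
    """Weighted sum of scaled temporal, spatial, speed, and direction distances (assumed form)."""
    a, b = points[i], points[j]
    parts = (
        abs(a.t - b.t),                      # temporal distance
        math.hypot(a.x - b.x, a.y - b.y),    # spatial distance
        abs(a.speed - b.speed),              # speed distance
        direction_changes(points, i, j),     # direction distance
    )
    return sum(w * p / s for w, p, s in zip(weights, parts, scales))


track = [
    TrackPoint(0.0, 0.0, 0.0, 8.0, 90.0),
    TrackPoint(600.0, 1000.0, 0.0, 8.0, 90.0),
    TrackPoint(1200.0, 1000.0, 0.0, 2.0, 200.0),
]
print(point_distance(track, 0, 1))  # steady sailing: small distance
print(point_distance(track, 1, 2))  # slowdown plus turn: larger distance
```

A density-based clusterer such as DBSCAN could then be run on the pairwise distances so that consecutive points with small distances fall into the same subtrajectory.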

4.2.3. Trajectory Spatiotemporal Distance Model

The abnormal behavior of fishing boats is continuous in time, which is represented by several abnormal trajectory points in the trajectory data. Therefore, we should not only consider the attributes of the trajectory points themselves but also pay attention to the differences between adjacent trajectory points. This paper describes how to extract the multiple eigenvalues of a trajectory; on this basis, a spatiotemporal distance model between trajectories is defined.

When a fishing vessel’s behavior is abnormal, its speed or direction can change dramatically. The direction of the fishing vessel is also unstable during the fishing phase. Therefore, to reduce the impact of the fishing stage on anomaly detection, in this study we use the average acceleration, the average angular acceleration, and the average velocity as three characteristics of behavior. In terms of spatial location, fishing vessels may appear in illegal locations due to acts such as invading restricted zones, and thus frequency is selected as the spatial feature in this paper. To sum up, the anomaly index of the trajectory is defined through the following four features:

(i) Frequency. The frequency of a trajectory indicates the density of the area through which the current trajectory passes. A lower frequency indicates an inactive fishing area, where the probability of an anomaly is greater; conversely, a higher frequency is assumed to correspond to a relatively small probability of an anomaly. The specific definition is given in equation (7). Here, indicates whether trajectory is in a frequent area, with 1 corresponding to a frequent area and 0 indicating that it is away from a frequent area:

(ii) Velocity. Velocity can be used to distinguish the behavior of fishing vessels, and it plays an important role in clustering normal behavior. Because the instantaneous velocity is random, it cannot reflect the general state of the current trajectory well, and thus, the average velocity is selected as the eigenvalue in this paper, as follows:

(iii) Acceleration. Acceleration reflects the stability of the fishing vessel. The greater the acceleration, the more unstable the behavior of the fishing vessel is and the greater the probability of abnormality. In this paper, the average acceleration is used as the fishing vessel characteristic quantity, as follows:

(iv) Angular Acceleration. Angular acceleration reflects the change in direction. Similarly to the above, the mean angular acceleration is used as the characteristic quantity of fishing vessels, as given in the following equations:
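A minimal Python sketch of extracting the four features from one subtrajectory is given below. The exact formulas are assumptions: accelerations are averaged as absolute finite differences, angular velocity is the wrapped heading change per second, and frequency is taken as the mean of the per-point frequent-area indicators. The function and key names are hypothetical.

```python
from statistics import mean
from typing import Dict, Sequence


def heading_delta(h1: float, h2: float) -> float:
    """Smallest signed heading change from h1 to h2 in degrees, within (-180, 180]."""
    d = (h2 - h1) % 360.0
    return d - 360.0 if d > 180.0 else d


def trajectory_features(times: Sequence[float], speeds: Sequence[float],
                        headings: Sequence[float],
                        in_frequent_area: Sequence[int]) -> Dict[str, float]:
    """Frequency, average velocity, average acceleration, average angular acceleration."""
    dts = [t2 - t1 for t1, t2 in zip(times, times[1:])]
    accels = [abs(v2 - v1) / dt for v1, v2, dt in zip(speeds, speeds[1:], dts)]
    ang_vel = [heading_delta(h1, h2) / dt
               for h1, h2, dt in zip(headings, headings[1:], dts)]
    ang_acc = [abs(w2 - w1) / dt for w1, w2, dt in zip(ang_vel, ang_vel[1:], dts[1:])]
    return {
        "frequency": mean(in_frequent_area),
        "avg_velocity": mean(speeds),
        "avg_acceleration": mean(accels) if accels else 0.0,
        "avg_angular_acceleration": mean(ang_acc) if ang_acc else 0.0,
    }


# Hypothetical subtrajectory sampled every 10 s; headings wrap through north.
features = trajectory_features(
    times=[0.0, 10.0, 20.0, 30.0],
    speeds=[4.0, 6.0, 6.0, 10.0],
    headings=[350.0, 10.0, 10.0, 40.0],
    in_frequent_area=[1, 1, 0, 1],
)
print(features)
```

Wrapping the heading change avoids treating a turn from 350 to 10 degrees as a 340-degree swing.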

In this paper, the anomaly index of a fishing vessel is evaluated by calculating the difference between the current trajectory and the normal trajectory mode. Therefore, a spatiotemporal distance model between trajectories needs to be defined. Because the different data magnitudes of different eigenvalues can affect the calculation of distances, the data needs to first be normalized, as shown in equation (12). The spatiotemporal distances of the trajectories and are then calculated, as shown in equations (13) and (14):
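The normalization and distance steps can be sketched in Python as follows. Since the equations are not reproduced here, the sketch assumes min-max normalization for the scaling step and a weighted Euclidean distance over the four features; both choices, the feature names, and the example values are illustrative assumptions.

```python
import math
from typing import Dict, List, Optional

FEATURES = ("frequency", "avg_velocity", "avg_acceleration", "avg_angular_acceleration")


def min_max_normalize(rows: List[Dict[str, float]]) -> List[Dict[str, float]]:
    """Scale every feature to [0, 1] across all trajectories (assumed normalization)."""
    lows = {f: min(r[f] for r in rows) for f in FEATURES}
    highs = {f: max(r[f] for r in rows) for f in FEATURES}
    normalized = []
    for r in rows:
        normalized.append({
            f: 0.0 if highs[f] == lows[f] else (r[f] - lows[f]) / (highs[f] - lows[f])
            for f in FEATURES
        })
    return normalized


def trajectory_distance(a: Dict[str, float], b: Dict[str, float],
                        weights: Optional[Dict[str, float]] = None) -> float:
    """Weighted Euclidean distance between two normalized feature vectors (assumed form)."""
    weights = weights or {f: 1.0 for f in FEATURES}
    return math.sqrt(sum(weights[f] * (a[f] - b[f]) ** 2 for f in FEATURES))


rows = [
    {"frequency": 1.0, "avg_velocity": 8.0, "avg_acceleration": 0.1, "avg_angular_acceleration": 0.0},
    {"frequency": 0.0, "avg_velocity": 2.0, "avg_acceleration": 0.5, "avg_angular_acceleration": 0.2},
    {"frequency": 1.0, "avg_velocity": 5.0, "avg_acceleration": 0.3, "avg_angular_acceleration": 0.1},
]
n = min_max_normalize(rows)
print(trajectory_distance(n[0], n[1]), trajectory_distance(n[0], n[2]))
```

Without the scaling step, the velocity term (in knots) would dominate the much smaller acceleration terms.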

4.2.4. Anomaly Detection Process

When moving to the edge layer for the real-time anomaly detection of fishing vessels, the trajectory data continue to flow in as a “trajectory flow.” To improve the performance of anomaly detection, is used to represent a basic time unit because of the insufficient information contained in the individual trajectory points. A contains more than one trajectory point. In this study, anomaly detection was performed on the incoming trajectory data in the current . Comparing the difference between and normal trajectory patterns collected in the current , the higher the difference, the greater the probability of abnormal fishing vessel behavior. Therefore, before anomaly detection, normal behavior patterns need to be obtained through historical trajectory modeling, which is the basis of trajectory anomaly detection.

(i) Modeling Historic Trajectories. Historic trajectories are clustered based on selected feature values during the modeling phase to discover frequent patterns hidden in the trajectory data set, as detailed in Algorithm 1.

Input:
 historical trajectory set
Output:
 normal trajectory model
Begin
(1) discover frequent regions
(2) get subtrajectories by dividing the training set
(3) for each subtrajectory do
(4)   extract trajectory features
(5) build normal trajectory model
(6) return normal trajectory model
End

The VAD-MFC historical trajectory modeling is divided into four main steps, as follows:

First, Determination of the Frequent Areas. This study considers the spatial location characteristics of trajectories, that is, whether or not they occur in frequent areas. Therefore, in this step the algorithm constructs frequent regions from historical trajectories, which facilitates the extraction of frequencies.

Second, Division into Subtrajectories. This stage groups the continuous trajectory points with the same behavior to ensure that each subtrajectory contains only one behavior, which optimizes the clustering effect. It uses the method described in Section 4.2.2, that is, clustering trajectory points based on a predefined spatiotemporal distance model. The clustering algorithm used here is DBSCAN.

Third, Extraction of Subtrajectory Features. After the subtrajectories are divided, the spatial location (frequency) and behavior (average velocity, average acceleration, and average angular acceleration) characteristics of all the subtrajectories are extracted.

Finally, Building a Normal Trajectory Model. In this stage, the subtrajectories are clustered based on the spatiotemporal distance model between trajectories, and the frequent pattern of trajectories, that is, the normal trajectory pattern, is extracted. Because DBSCAN can process noise points effectively and cluster quickly, it is again used to cluster historical trajectories in this stage.

(ii) Trajectory Anomaly Detection. The abnormal index of the trajectory data TR collected in the current is calculated based on the prebuilt normal trajectory model . If is greater than the anomaly threshold , then the current behavior of the fishing vessel is considered to be abnormal. Detailed steps are given in Algorithm 2.

Input:
 normal trajectory model
 trajectory
Output:
 abnormal index
Begin
(1) trajectory feature extraction
(2) update normal trajectory model by incremental DBSCAN
(3) for each normal trajectory pattern do
(4)   calculate trajectory distance
(5) get the minimum trajectory difference
(6) return abnormal index
End

Trajectory anomaly detection using VAD-MFC is divided into four main steps, as follows:

First, extract the characteristics of the current trajectory flow . Features extracted include the average velocity , average acceleration , average angular acceleration , and frequency .

Second, update the normal trajectory model dynamically. The incremental DBSCAN algorithm is used to incrementally cluster the trajectory stream data collected in the current , so as to update the global feature model dynamically and improve the accuracy of anomaly detection.

Third, calculate the distance between and all the normal trajectory modes. The calculation methods are given in equations (13) and (14).

Finally, obtain the minimum difference between and the normal trajectory mode, that is, the anomaly index .
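The last two steps reduce to a nearest-pattern search, which the following Python sketch illustrates. It assumes that each normal trajectory pattern is summarized by one normalized feature vector (frequency, average velocity, average acceleration, average angular acceleration) and that an unweighted Euclidean distance stands in for equations (13) and (14); the incremental DBSCAN update is omitted, and the pattern values and threshold are hypothetical.

```python
import math
from typing import Sequence


def feature_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two normalized feature vectors (stand-in metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def anomaly_index(current: Sequence[float],
                  normal_patterns: Sequence[Sequence[float]]) -> float:
    """Minimum distance from the current trajectory to any normal pattern."""
    if not normal_patterns:
        raise ValueError("at least one normal pattern is required")
    return min(feature_distance(current, pattern) for pattern in normal_patterns)


def is_abnormal(current: Sequence[float],
                normal_patterns: Sequence[Sequence[float]],
                threshold: float) -> bool:
    """Flag the trajectory when its anomaly index exceeds the anomaly threshold."""
    return anomaly_index(current, normal_patterns) > threshold


# Hypothetical pattern centres: one for sailing, one for fishing.
patterns = [(1.0, 0.8, 0.1, 0.1), (1.0, 0.2, 0.3, 0.6)]
print(is_abnormal((1.0, 0.75, 0.1, 0.1), patterns, threshold=0.5))  # False
print(is_abnormal((0.0, 0.9, 0.9, 0.9), patterns, threshold=0.5))   # True
```

Taking the minimum means a trajectory is normal as long as it resembles any one frequent pattern, whether sailing or fishing.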

4.3. Vessel Trajectory Anomaly Detection Based on Spatiotemporal Neighbor Similarity

Anomaly detection methods based on multifeature clustering use normal historical trajectory patterns as the reference standard. They detect existing patterns well but have difficulty detecting new anomaly patterns and anomalous behavior in a specific scene. To solve these problems, we present VAD-SNS.

The VAD-SNS algorithm is based on the similarity of fishing vessel behaviors within a certain spatiotemporal range (for example, the return of a fishing vessel during a storm), and thus, if there is a significant difference between the vessel trajectory in an area and the vessel trajectory in the adjacent area, the behavior is considered abnormal. The following describes in detail how to detect anomalies in fishing vessel behavior online using spatiotemporal near-neighbor similarity.

This study builds on a framework for edge computing from previous research [32], in which fishing vessels collect positioning information in real-time through an on-board terminal device during navigation. For the positioning information collected in the current , it can be formally expressed as a series of trajectory points in chronological order, that is, , where is the fishing vessel at time stamp ’s position coordinates and . At the same time, the fishing boat sends position information to nearby nodes through edge computing nodes so that each edge node can obtain a neighbor trajectory set , where represents the position information collected within the current of node .

Because only the position information of fishing vessels in adjacent sea areas can be received passively through automatic identification systems (AIS), there may be missing data in the trajectories. To improve the accuracy of trajectory anomaly detection, the abnormal behavior of fishing vessels can be evaluated by synthesizing the abnormality degree of all trajectory segments in . That is, the segment set of adjacent trajectories for each trajectory segment is obtained based on the distance model between trajectories, and then the abnormality degree of the current trajectory segment is evaluated based on the behavioral difference characteristics in the local neighborhood. Finally, the abnormality index of the current trajectory is obtained by combining the abnormality degree of all the trajectory segments.

4.3.1. Neighbor Trajectory Segmentation

The VAD-SNS algorithm first requires the spatiotemporal neighbor trajectory segment set. The spatiotemporal local neighbors include temporally local neighbors and spatially local neighbors. Because each trajectory point in is collected for the current , all the trajectory segments are temporal neighbors. For any trajectory segment , its spatial neighbors are the trajectory segments whose distance from it is no greater than a given threshold , as

Here, represents the spatial distance of the trajectory segment between and . In this paper, the Hausdorff distance [39] is used, as shown in the following equations:
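As an illustration of the distance model, the following is a minimal sketch of the standard symmetric Hausdorff distance between two trajectory segments, together with a threshold-based neighbor selection. The function names, the representation of a segment as a list of planar (x, y) points, and the use of Euclidean point distance are assumptions of this sketch rather than details taken from the paper.

```python
import math


def point_distance(p, q):
    """Euclidean distance between two (x, y) points (planar approximation)."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def directed_hausdorff(seg_a, seg_b):
    """Largest distance from a point of seg_a to its nearest point in seg_b."""
    return max(min(point_distance(p, q) for q in seg_b) for p in seg_a)


def hausdorff(seg_a, seg_b):
    """Symmetric Hausdorff distance between two non-empty point sequences."""
    return max(directed_hausdorff(seg_a, seg_b),
               directed_hausdorff(seg_b, seg_a))


def spatial_neighbors(segment, candidates, threshold):
    """Candidate segments whose Hausdorff distance to `segment` is <= threshold."""
    return [c for c in candidates if hausdorff(segment, c) <= threshold]


# Illustrative values only.
current = [(0.0, 0.0), (1.0, 0.0)]
near = [(0.0, 0.5), (1.0, 0.5)]
far = [(0.0, 5.0), (1.0, 5.0)]
neighbors = spatial_neighbors(current, [near, far], threshold=1.0)
```

With a threshold of 1.0, only the `near` segment is kept as a spatial neighbor of `current`.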

4.3.2. Local Anomaly Evaluation

The local outlier factor (LOF) is a density-based anomaly detection mechanism [40]. It is widely used to detect local neighborhood anomaly data because it considers the density of data samples relative to their spatial neighbors and does not require prior knowledge of the data distribution. Based on the LOF, in this method we use the local difference density (LDD) to evaluate the degree of behavioral difference between current trajectory segment and its neighbor trajectory segment set , as

Here, denotes the number of trajectory segments contained in and denotes the degree of behavioral difference between trajectory segments and . In this paper, we mainly consider the characteristics of direction and speed, as
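To make the idea of a direction-and-speed difference concrete, the sketch below computes a hypothetical behavioral difference between two segments and averages it over a neighbor set. The summary of each segment as a (mean speed, mean course) pair, the equal weighting of the two terms, the 15-knot normalizer, and the use of a plain mean for the LDD are all assumptions made for illustration; the paper's own equations may differ.

```python
def angular_difference(course_a, course_b):
    """Smallest absolute difference between two courses, in degrees (0..180)."""
    diff = abs(course_a - course_b) % 360.0
    return min(diff, 360.0 - diff)


def behavior_difference(seg_a, seg_b, max_speed=15.0):
    """Hypothetical difference score in [0, 1] from speed and direction.

    Each segment is a (mean_speed_knots, mean_course_deg) pair. The equal
    weighting and the max_speed normalizer are assumptions of this sketch.
    """
    speed_term = min(abs(seg_a[0] - seg_b[0]) / max_speed, 1.0)
    direction_term = angular_difference(seg_a[1], seg_b[1]) / 180.0
    return 0.5 * speed_term + 0.5 * direction_term


def local_difference_density(segment, neighbors):
    """Mean behavioral difference between a segment and its neighbor set."""
    if not neighbors:
        return 0.0
    total = sum(behavior_difference(segment, n) for n in neighbors)
    return total / len(neighbors)
```

The wrap-around in `angular_difference` matters in practice: courses of 350° and 10° differ by 20°, not 340°.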

When fishing vessels are sailing, the behavior of all the vessels in an adjacent sea area often differs significantly. Judged by the LDD alone, the current trajectory would then be flagged as containing an exception, which does not correspond to the facts. A trajectory segment should be considered an anomaly only if the other fishing vessels in the neighborhood behave similarly to one another while the current fishing vessel differs significantly from them. Therefore, based on the LOF, the local anomaly factor (LAF) is used to evaluate the degree of abnormality of each trajectory segment in its neighborhood [41]. Larger LAF values correspond to a greater likelihood that the trajectory segment is abnormal, as
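The reasoning above can be sketched with an LOF-style ratio: the LDD of the current segment divided by the mean LDD of its neighbors. This particular form, the function name, and the small `eps` guard against division by zero are assumptions of this sketch, not the paper's definition.

```python
def local_anomaly_factor(segment_ldd, neighbor_ldds, eps=1e-9):
    """LOF-style ratio of a segment's LDD to the mean LDD of its neighbors.

    A value near 1 means the segment differs from its neighbors about as
    much as they differ from each other; a large value means the neighbors
    are mutually similar while the segment stands out. (Assumed form.)
    """
    if not neighbor_ldds:
        return 0.0
    mean_neighbor_ldd = sum(neighbor_ldds) / len(neighbor_ldds)
    return segment_ldd / (mean_neighbor_ldd + eps)


# Homogeneous neighborhood: the current segment stands out.
laf_outlier = local_anomaly_factor(0.5, [0.1, 0.1, 0.1])
# Heterogeneous neighborhood: everyone differs, so nothing stands out.
laf_mixed = local_anomaly_factor(0.5, [0.5, 0.5, 0.5])
```

Under this form, the same LDD of 0.5 yields a factor near 5 in the homogeneous neighborhood and near 1 in the heterogeneous one, which matches the distinction drawn in the text.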

The LAF describes the degree of abnormality of trajectory segments within their local neighborhoods, while the collected in the current contains multiple trajectory segments. Thus, the abnormality of trajectory can be defined as

Given the time-varying evolution of the trajectory flow data, the same behavioral characteristics of fishing vessels may be abnormal within the current but become normal in the next . Therefore, we should not only focus on the LAFs of the current but also consider the influence of the historical LAFs of the fishing vessel. Sliding window models are widely used in real-time data streaming because they can process persistent streaming data effectively while eliminating obsolete and invalid data. In this paper, we use a sliding window model based on time decay [42] to ensure data freshness.

The sliding window model based on time decay is shown in Figure 4. The size of the sliding window is N. Each time a new datum is processed, the sliding window moves one to the right, thereby removing the LAF of the farthest . A time decay function reduces the influence of historical LAFs. For all the fishing vessels, the historical LAF obtained for the previous is multiplied by a forward decay time function, and then the LAFs produced by the current are accumulated to obtain the online anomaly index of the fishing vessels. The value of the forward decay time function is the distance between the current and the leftmost of the sliding window, as defined in the following equation:

Here, is the current , is the leftmost of the sliding window, and is a monotonic nondecreasing time function.
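A minimal sketch of such a window is given below, assuming a forward-decay weighting in which each stored LAF is weighted by g(t_i - L) / g(t - L), where L is the time index of the leftmost entry in the window and t is the current time index. The class name, the default choice g(x) = x + 1, and the use of integer time indexes are assumptions made for illustration.

```python
from collections import deque


class DecayedAnomalyWindow:
    """Sliding window of LAF values with forward time decay (sketch)."""

    def __init__(self, size, g=lambda offset: offset + 1.0):
        # A deque with maxlen drops the oldest (time_index, laf) pair
        # automatically once the window is full.
        self.window = deque(maxlen=size)
        self.g = g  # assumed positive and monotonic nondecreasing

    def update(self, time_index, laf):
        """Add the newest LAF and return the decayed anomaly index."""
        self.window.append((time_index, laf))
        landmark = self.window[0][0]  # leftmost entry of the window
        norm = self.g(time_index - landmark)
        return sum(self.g(t - landmark) / norm * value
                   for t, value in self.window)


window = DecayedAnomalyWindow(size=3)
indexes = [window.update(t, 1.0) for t in range(4)]
```

The newest LAF always receives weight 1, and older entries receive progressively smaller weights, so stale behavior contributes less to the online anomaly index.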

The VAD-SNS algorithm is given in Algorithm 3.

Input:
 trajectory
Output:
 abnormal index
Begin
(1) for each trajectory segment do
(2)  Obtain the spatiotemporal neighbor trajectory segment set for the segment
(3)  Calculate the local difference density of the segment
(4)  Calculate the local anomaly factor of the segment
(5)  Calculate the local anomaly factor for the current trajectory flow
(6)  Obtain the anomaly index using the time decay function
(7) return the anomaly index
End

4.4. Real-Time Anomaly Detection

RADM is set at the edge layer to improve the real-time performance and accuracy of anomaly detection, as shown in Figure 4. It can detect the current trajectory data in real-time. If abnormal behavior is detected, then it will alert and send the results to the monitoring center.

RADM combines an online anomaly detection algorithm with a historical trajectory extraction algorithm. First, normal patterns hidden in the historical trajectory set are extracted based on multifeature clustering, and the distance from normal patterns is taken as the historical anomaly index . Then, combined with the dynamic information of moving neighborhood nodes, the online anomaly index is obtained based on spatiotemporal neighborhood similarity. Finally, a comprehensive anomaly index is obtained by weighting and combining the two anomaly indexes:

Here, represents the weight of the online anomaly index and represents the weight of the historical anomaly index. Because the moving edge node mainly receives ship data from nearby sea areas through AIS, when a fishing vessel travels to sparse areas, data from too few trajectories may be received, which will affect the accuracy of the online anomaly index. In that case, the weight of the online anomaly index can be reduced appropriately so that detection relies mainly on historical trajectory extraction. Conversely, when enough neighboring trajectories are available, the weight can be increased.
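The weighting scheme can be sketched as a convex combination of the two indexes. The assumption that the two weights sum to 1, the function names, and the neighbor-count heuristic for lowering the online weight in sparse areas (including the `min_neighbors` parameter) are illustrative choices, not the paper's stated formula.

```python
def combined_anomaly_index(online_index, historical_index, online_weight):
    """Weighted sum of the online and historical anomaly indexes.

    Assumes both indexes are on a comparable scale and that the two
    weights sum to 1 (an assumption of this sketch).
    """
    if not 0.0 <= online_weight <= 1.0:
        raise ValueError("online_weight must lie in [0, 1]")
    return online_weight * online_index + (1.0 - online_weight) * historical_index


def adjusted_online_weight(base_weight, neighbor_count, min_neighbors=5):
    """Hypothetical heuristic: shrink the online weight when few neighbor
    trajectories were received, so the historical index dominates."""
    if min_neighbors <= 0:
        return base_weight
    return base_weight * min(1.0, neighbor_count / min_neighbors)


# In a sparse area with no neighbors, the result equals the historical index.
sparse_weight = adjusted_online_weight(0.3, neighbor_count=0)
sparse_index = combined_anomaly_index(0.9, 0.2, sparse_weight)
```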

5. Results and Discussion

5.1. Experimental Setup

The study sea area was divided into 0.01° (latitude) × 0.01° (longitude) grids for a total of 1000 × 1000 grid squares. We installed edge computing nodes on vessels using the Zhoushan fishing VMS [32]. The frequent regions were constructed from fishing vessel trajectory data collected by the Zhoushan fishing VMS from May 2016 to November 2018. Grids with trajectory densities greater than 100 were marked as frequent regions, and the sampling interval between the trajectory points was 30 seconds. The results are shown in Figure 5, where black represents frequent areas and white represents infrequent areas. The experiments were run on a Windows 10 system with an Intel Core i7 4.8 GHz processor and 32 GB of RAM.
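The construction of frequent regions can be sketched as counting trajectory points per grid cell and keeping the cells above the threshold. Interpreting "trajectory density" as the number of trajectory points falling in a cell, and the function names used here, are assumptions of this sketch.

```python
import math
from collections import Counter


def grid_cell(lat, lon, cell_size=0.01):
    """Index of the cell_size-degree grid cell containing a position."""
    return (math.floor(lat / cell_size), math.floor(lon / cell_size))


def frequent_cells(points, min_density=100, cell_size=0.01):
    """Cells whose point count is greater than min_density.

    `points` is an iterable of (lat, lon) pairs; density is taken to be
    the number of trajectory points in a cell (assumption).
    """
    counts = Counter(grid_cell(lat, lon, cell_size) for lat, lon in points)
    return {cell for cell, n in counts.items() if n > min_density}


# Illustrative data: 101 points in one cell, exactly 100 in another.
points = [(30.005, 122.005)] * 101 + [(30.015, 122.005)] * 100
frequent = frequent_cells(points)
```

Only the first cell exceeds the threshold, because the comparison is strictly greater than 100.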

To verify the effectiveness of the RADM, we conducted comparative experiments using VAD-MFC, VAD-SNS, and RADM on a selected data set consisting of 50 normal trajectories and eight abnormal trajectories. We used the anomaly index mentioned above and the commonly used receiver operating characteristic (ROC) curve as the evaluation standard of anomaly detection.

To simulate the edge computing environment better, we interpolated the real trajectory data. The sample interval after interpolation was 2 seconds, and the following six possible anomalies were randomly added: steady direction and large fluctuations in speed, stable speed and large fluctuations in direction, large fluctuations in speed and direction, speed greater than 15 knots, avoidance of frequent areas, and abnormal behavior in a specific environment (e.g., fast transit through a port area).
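A minimal sketch of the resampling step is shown below, assuming simple linear interpolation of latitude and longitude between consecutive fixes. The paper does not state which interpolation method was used, so the method, the function name, and the (time, lat, lon) tuple layout are assumptions.

```python
def interpolate_track(points, step=2.0):
    """Linearly resample a track to a fixed time step.

    `points` is a list of (t_seconds, lat, lon) tuples sorted by time.
    Sampling restarts at every original fix, so the output is evenly
    spaced only when the original interval is a multiple of `step`.
    """
    if not points:
        return []
    result = []
    for (t0, lat0, lon0), (t1, lat1, lon1) in zip(points, points[1:]):
        t = t0
        while t < t1:
            frac = (t - t0) / (t1 - t0)
            result.append((t,
                           lat0 + frac * (lat1 - lat0),
                           lon0 + frac * (lon1 - lon0)))
            t += step
    result.append(points[-1])  # keep the final original fix
    return result


# Two fixes 30 s apart become 16 samples at 2 s spacing.
dense = interpolate_track([(0, 0.0, 0.0), (30, 0.3, 0.6)])
```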

The interpolated data are shown in Figure 6.

In a real environment, the number of normal trajectories is much larger than the number of abnormal trajectories. Therefore, based on the dataset’s statistics, the ratio of normal to abnormal trajectories in the experimental setup was 42195 : 1005. Edge nodes broadcast dynamic data according to the update time (as shown in Table 1) and the communication distance of the AIS device. To simplify the calculation model, the communication distance of the experiment was set to 5 nautical miles; that is, each edge node could receive AIS information from anywhere within the surrounding 5 nautical miles. Finally, anomaly detection was performed on the received trajectory stream data.
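The 5-nautical-mile reception rule can be sketched as a great-circle distance filter. The haversine formula, the mean Earth radius constant, and the function names are choices made for this illustration; the paper does not specify how the distance was computed.

```python
import math

EARTH_RADIUS_NM = 3440.065  # mean Earth radius expressed in nautical miles


def haversine_nm(lat1, lon1, lat2, lon2):
    """Great-circle distance between two positions, in nautical miles."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_NM * math.asin(math.sqrt(a))


def nodes_in_range(own_position, others, max_range_nm=5.0):
    """Identifiers of nodes whose position lies within max_range_nm.

    `own_position` is (lat, lon); `others` maps a node id to (lat, lon).
    """
    lat0, lon0 = own_position
    return [node_id for node_id, (lat, lon) in others.items()
            if haversine_nm(lat0, lon0, lat, lon) <= max_range_nm]


# Node "a" is about 3 nautical miles north, node "b" about 12.
received = nodes_in_range((30.0, 122.0),
                          {"a": (30.05, 122.0), "b": (30.2, 122.0)})
```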

5.2. Experimental Results

To ensure that each subtrajectory contains only independent behavior characteristics when building a normal trajectory model, VAD-MFC first partitions the trajectories according to the spatiotemporal distance model and then extracts the behavior characteristics of the subtrajectories and clusters them. Because of the large time span of the subtrajectories, the latter were divided into five-minute interval trajectory segments to produce a better clustering effect. The trajectory segments were then clustered. The clustering result is shown in Figure 7, where the x-axis represents the change of speed, the y-axis represents the change of direction, and the z-axis represents the average speed. As shown in the graph, the trajectory segments were grouped into three categories corresponding to the three normal behaviors of fishing vessels (green for navigation, orange for mooring, and blue for fishing). The clustered centroids of each class represent the frequent pattern of characteristics of the current behavior.

Figure 8 compares the frequencies of the normal and abnormal trajectories. Both normal and abnormal trajectories were mostly located in frequent areas, but some had low frequencies (such as abnormal trajectory No. 6). Because the number of abnormal trajectories marked in this experiment was small and the abnormal behavior features were sparse, fishing vessels away from frequent areas may have demonstrated abnormal behavior. Frequency was used as a spatial location feature of the trajectory in the following experiments; that is, fishing boats that were far away from the frequent areas may have been identified as anomalies.

Figures 9 and 10 show the anomaly detection results of VAD-MFC and RADM, respectively. The abscissa represents different trajectory sequences, and the ordinate represents the anomaly index of the current trajectory. As shown, VAD-MFC failed to detect some individual abnormal trajectories, and the detection effect of RADM was better than that of VAD-MFC.

To evaluate the effectiveness of RADM in the edge computing framework [32], in this section we compare the anomaly detection accuracy of VAD-MFC, VAD-SNS, and RADM. Compared with other anomaly detection indicators, the ROC curve has greater tolerance to the imbalance of positive and negative samples, and thus we use the ROC curve as the evaluation standard for anomaly detection [44].
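For reference, the area under the ROC curve can be computed directly from anomaly indexes and labels through its rank interpretation: the probability that a randomly chosen abnormal trajectory scores higher than a randomly chosen normal one, with ties counted as one half. The sketch below uses this pairwise form; the function name and the label convention (1 for abnormal, 0 for normal) are assumptions. Its cost grows with the product of the two class sizes, so a sorted-rank implementation would be preferable for large data sets.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    `scores` are anomaly indexes; `labels` use 1 for abnormal and 0 for
    normal trajectories. Ties between a positive and a negative count 0.5.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Three of the four abnormal/normal pairs are ranked correctly.
example_auc = roc_auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
```

Because the value depends only on how abnormal scores rank against normal ones, it is unaffected by the ratio of normal to abnormal samples, which is the property the text relies on.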

Figure 11 shows the ROC curves of VAD-MFC and VAD-SNS. Both algorithms achieved reasonable anomaly detection accuracy: the area-under-curve (AUC) value of VAD-MFC reached 0.79 and that of VAD-SNS reached 0.8, both much higher than the 0.5 obtained through random classification. When the false positive rate of VAD-MFC was 0, the true positive rate was close to 0.6; thus, VAD-MFC was better than VAD-SNS in the scenario of normal operation without false alarms. The abscissa of the equal error rate point in the VAD-SNS algorithm was smaller than that in VAD-MFC, and thus VAD-SNS was better than VAD-MFC when the probability of detecting normal and abnormal samples was equal.

The RADM combines the advantages of VAD-SNS and VAD-MFC. The results were normalized and weighted, and Figure 12 shows the accuracy of the RADM anomaly detection under different weight indexes, where the abscissa represents the weight of VAD-SNS and the ordinate represents the AUC value of RADM under the current weight. The AUC value of RADM increased with the increase of the VAD-SNS algorithm weight and then gradually decreased. The AUC value reached its maximum when the weight index was 0.29.
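The weight selection described here can be sketched as a simple grid search: normalize both indexes, combine them at each candidate weight, and keep the weight with the highest AUC. Min-max normalization, the step count, the convex combination, and all function names are assumptions of this sketch, and the toy data are illustrative only, not the experimental data.

```python
def min_max(values):
    """Scale values to [0, 1]; a constant list maps to all zeros."""
    low, high = min(values), max(values)
    if high == low:
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]


def auc(scores, labels):
    """Pairwise AUC; labels use 1 for abnormal and 0 for normal."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def best_online_weight(online, historical, labels, steps=100):
    """Return (weight, auc) maximizing AUC over weights 0, 1/steps, ..., 1."""
    online_n, historical_n = min_max(online), min_max(historical)
    best_weight, best_auc = 0.0, -1.0
    for i in range(steps + 1):
        w = i / steps
        combined = [w * o + (1.0 - w) * h
                    for o, h in zip(online_n, historical_n)]
        score = auc(combined, labels)
        if score > best_auc:  # keep the first weight reaching the maximum
            best_weight, best_auc = w, score
    return best_weight, best_auc


# Toy data: each index alone misses one abnormal trajectory.
labels = [1, 1, 0, 0]
online = [0.9, 0.2, 0.5, 0.1]
historical = [0.2, 0.9, 0.5, 0.1]
weight, score = best_online_weight(online, historical, labels)
```

In the toy data, either index alone reaches an AUC of 0.75, while an intermediate weight separates the two classes completely, mirroring the rise-then-fall shape described for Figure 12.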

Figures 13 and 14 show the RADM anomaly detection results with a weighted index of 0.29. Figure 13 shows the distribution of the anomaly index, in which the red “X” represents the anomaly index of the abnormal trajectory, and the blue dot represents the anomaly index of the normal trajectory. Although the anomaly index of the normal trajectory was occasionally larger, the anomaly index of the abnormal trajectory was generally higher than that of the normal one, and the overall detection effect was better.

Figure 14 shows a comparison of the ROC curves of VAD-MFC, VAD-SNS, and RADM. The blue curve shows RADM with a weighted index of 0.29, and its AUC value was 0.92, which was much higher than the 0.79 of VAD-MFC and the 0.8 of VAD-SNS. Moreover, the abscissa of the equal error rate point of the model was also smaller than that of VAD-MFC and VAD-SNS. Therefore, RADM had higher anomaly detection accuracy.

6. Conclusions

In this study, we designed and implemented a real-time anomaly detection model for fishing vessel trajectories based on an edge computing paradigm. First, a trajectory anomaly detection algorithm based on multifeature clustering was designed according to the behavioral characteristics of fishing vessels. This algorithm could not effectively detect new abnormal patterns in the trajectory flow data, and its real-time performance was weak. To address these problems, the RADM was proposed by introducing an edge computing framework and combining it with the spatiotemporal near-neighbor similarity. Finally, experiments were carried out on real data sets. The experimental results show that RADM had high detection accuracy in detecting anomalies of continuous trajectory flow. RADM can provide a trajectory anomaly detection service in the edge layer without interaction with a terrestrial monitoring center; thus, it can improve the real-time performance of trajectory anomaly detection without the limitations of marine communications.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was partially supported by the National Natural Science Foundation of China (no. 61972358) and the Zhejiang Province Key Research and Development Project (no. 2017C03024).