With the development of smart devices and mobile communication technologies, e-commerce has spread over all aspects of life. Abnormal transaction detection is important in e-commerce since abnormal transactions can result in large losses. Additionally, integrating data flow and control flow is important in the research of process modeling and data analysis since it plays an important role in the correctness and security of business processes. This paper proposes a novel method of detecting abnormal transactions via an integration model of data and control flows. Our model, called Extended Data Petri net (DPNE), integrates the data interaction and behavior of the whole process from the user logging into the e-commerce platform to the end of the payment, which also covers the mobile transaction process. We analyse the structure of the model, design the anomaly detection algorithm of relevant data, and illustrate the rationality and effectiveness of the whole system model. Through a case study, it is proved that each part of the system can respond well, and the system can judge each activity of every mobile transaction. Finally, the anomaly detection results are obtained by some comprehensive analysis.

1. Introduction

E-commerce has developed rapidly and is now in a golden age of the digital economy. Individuals and companies keeping pace with the times use e-commerce to sell or buy goods and services by electronic payment [1]. According to statistics, the amount of mobile transactions is increasing significantly year by year [2]. However, anomalous events and attacks frequently occur in transaction systems, especially in mobile applications [3, 4], and it is difficult for an e-commerce system to deal with fraud [5]. One reason is that a component is often shared by multiple software systems, and many threats and attacks are related to such components, e.g., sniffing, spoofing, and malware [6]. Moreover, if malicious attackers take advantage of the knowledge gap among multiple participants, it is difficult to ensure the security of the payment processes [7, 8]. Therefore, how to ensure a healthy and secure trading environment is always a key consideration for e-commerce platforms. Furthermore, the development of a comprehensive fraud detection system is of great importance to many organizations and companies [1, 9], and many fraud detection methods for electronic transactions have been proposed accordingly.

Abnormal transactions or frauds mainly refer to the use of some means to defraud money or commodities [7]. The task of detecting abnormal transactions by machine learning can be seen as a binary classification problem, and this kind of method relies on historical data. Issues such as skewed distribution and concept drift make it challenging for machine learning technology to extract meaningful patterns from the historical data [5]. Some complex machine learning methods, such as support vector machines, neural networks, and deep learning, are black boxes, which means that it is difficult for them to interpret fraud patterns, while interpretability is the key to designing fraud prevention mechanisms [9]. Zheng et al. proposed an improved machine learning method of TrAdaBoost (ITrAdaBoost), which is more suitable for dealing with the concept drift problem under different data distributions [10]. It is important and necessary to use technology that helps detect anomalies and prevent fraud. Fraud prevention includes taking measures to prevent fraud from occurring or responding quickly to fraud so as to prevent losses in time [2]. However, real-time fraud prevention is not easy [2]. Data mining techniques make it possible to find valuable patterns in data sets, but some data mining methods are not process centric [11].

Electronic commerce systems are complex, highly concurrent, and distributed. The formal analysis method is an efficient knowledge representation and discovery tool, such as formal concept analysis (FCA) [12]. Data errors and state inconsistencies are often inseparable from the abnormal events in an online shopping system [13]. Petri net and its extensions are useful tools for modeling and simulating business processes, which can describe and reflect the trading process of multiparties dynamically [14, 15]. However, the general Petri nets cannot exactly express the data operations of concurrent read, coverable write, and some other operations well. That is why a Petri net with data operations (PN-DO) is promoted [16]. PN-DO can analyse data operations in a concurrent system and check their data inconsistency, missing data, and other data flow errors. Due to the fact that a transaction fraud detection system is inseparable from data-driven methods [9], it requires data analysis methods to obtain data attributes. Data Petri net (DPN) is a kind of net that adds some data attributes to a Petri net, and it has been widely applied to different business process optimizations [17]. Because the data flow of the e-commerce model often has a large number of concurrent read and write operations, we should extend DPN to model data operations besides data attributes. The existing data analysis methods are not enough to cope with the abnormal data interaction among the activities of the e-commerce transaction model based on the process perspective and the fusion detection of user behavior habits, and thus, we hope that our extension can also handle these problems.

How to analyse and predict the online transaction process in real-time is the focus of our concern. In this paper, the users with abnormal behavior combinations in the overall process of electronic transactions are regarded as abnormal users. We take e-commerce fraud detection as a background for modeling and case analysis. This paper mainly researches on the data interaction of e-commerce transactions and analyses its anomalies. An e-commerce system is a distributed system and a multiparty data interaction process. We choose Petri net as our analysis method because it is an effective modeling tool and can dynamically reflect data interactions between various activities and resources [13, 18]. Based on that, this paper combines PN-DO with DPN and proposes an extended net, i.e., Extended Data Petri net (DPNE). Because it can combine the functions of algorithms, the function of data flow analysis can be expanded, so that activities can be expressed and analysed more intuitively. This approach ties the control flow more closely with the data flow. Our research mainly focuses on the data flow analysis technology of DPNE. It is mainly used to solve the integrated data flow processing of transactions. We propose two algorithms to describe how it works, i.e., outlier extraction and abnormal order detection.

We propose an abnormal orders detection algorithm which refers to conformance checking [17] and a full sequence comparison algorithm [19] to realize dynamic full tracking of the transaction process. This mechanism can dynamically reflect the data interaction and the state of each activity from the perspective of control flow and data flow. Finally, we combine the standard model predefined by experts to analyse the comprehensive situation of all activities so as to analyse and judge whether the current transaction is abnormal or not.

The main contributions of this research are as follows: (1) PN-DO [16] and DPN [17] are employed for the formalization of e-commerce transaction systems, and their whole business process is modeled and analysed by an Extended Data Petri net (DPNE); (2) by analysing the model structure of DPNE, the whole system is optimized. Then, some abnormal detection algorithms for integrated data flow processing of orders are proposed. Furthermore, the running states of this system are analysed by the algorithm in [20].

The paper is organized as follows. Section 2 presents an overview of the current research status of anomaly detection and other correlation analysis methods of e-commerce. Section 3 introduces the related basic concepts and proposes DPNE. In Section 4, a pattern matching method is integrated into our conformance checking, and the working principle is described by some algorithms. Section 5 presents, through a detailed case study, the method of outlier extraction and analysis proposed in the previous section. Finally, the paper is summarized in Section 6.

2. Related Work

At present, there are many studies concerning anomaly detection of e-commerce systems [21–25]. There are two key points in anomaly detection technology. One is how to establish a library of normal behavior patterns based on historical information, and the other is how to compare the current behavior patterns with normal behavior patterns [21]. Anomaly detection often uses data mining technologies, the purpose of which is to extract unknown and valuable patterns or rules from a large amount of data. The most widely used data mining algorithms include data classification, correlation analysis, and sequence mining [9, 10]. In order to cope with the diversification of users’ transaction behaviors, Zheng et al. proposed the logic graph BP (LGBP), which is a total-order-based model. It is used to represent the logical relationship of transaction record attributes [22]. With machine learning classification algorithms, the credit scoring model is used to predict customer default in e-commerce by analysing the historical data [25]. However, these related studies always focus on historical data and lack analysis of concurrent operations, although e-commerce systems are full of concurrent activities. Therefore, a comprehensive method that is equipped with data analysis and with real-time monitoring and investigation of each activity of the control flow is still needed.

There are some studies on process modeling and analysis, e.g., Petri nets are used to build e-commerce business processes [13]. However, most of these studies simulate the business process of e-commerce transactions and then analyse the rationality of their models so as to ensure the transaction properties of the e-commerce business process, rather than use formal modeling tools to assist outlier analysis and decision-making [26–28]. It is essential to use formal methods to model and analyse the information exchange in the trading process of e-commerce transactions. For example, Bartosz et al. [29] used Colored Petri nets to formally model cyber threats directed at computer systems and then solved up-to-date problems related to threat detection and prevention. Logical Petri net (LPN) has been proved to be a great tool to simulate an e-commerce system [14]. PN-DO not only retains the benefits of the prototype Petri net but also is suitable for modeling and analysis of large-scale concurrent reads and writes [16]. Related cases have proved that it can model and analyse some business processes effectively, e.g., publications and multithreads [30].

Abnormal detection is also related to conformance checking. The basic idea of process mining is to diagnose business processes by mining event logs to obtain knowledge. As one of its applications, conformance checking is widely used to investigate and quantify the discrepancies between the real execution logs and process models, i.e., the difference between the real behaviors and the behaviors determined by models [31]. To reveal this kind of deviation, data attributes are necessary [32]. Existing conformance checking methods include conformance checking that focuses on control flows [31, 33] and techniques for diagnosing data-related deviations [34–37]. However, these methods cannot be directly applied in the field of e-commerce and do not detect anomalies in multiple activities in a transaction process, such as user behavior detection or a comprehensive decision that takes into account multiple transitions [32, 34, 36, 37]. Conformance checking can analyse data flows to explain whether current data is suitable for some specific transitions or not and discover the related decision points [17]. In order to improve business processes, van der Aalst et al. compared the real behaviors of an information system (or its users) with the expected behavior by delta analysis, e.g., certain activities were performed by specific users [34]. Moreover, data plays an important role. For example, routing in a process is usually determined by data, which indicates that the control flow is to some extent data dependent [32]. de Leoni et al. [32] proposed a technique to check the conformance of data-aware process models. They used Data Petri nets (DPN) to model data variables, guards, and read and write operations, and matched data attributes of a single activity. DPN incorporates various data attributes and further expands the function of conformance checking.
When multiple activities with data attributes compete for the unique resource, this is a decision point, and the condition of an activity triggered by the decision point is a rule [36, 37]. Haarmann et al. [35] used Colored Petri nets to describe a decision-aware process formally and then utilized temporal logics to check compliance rules. Process mining focuses on the overall process of a transaction, and conformance checking of data flows also greatly improves the ability compared to control flows individually. However, it is not enough to directly detect and analyse the abnormal behavior in the process of e-commerce transactions.

The above studies mainly focus on the modeling and analysis of resources, data mining of historical transaction data, or matching of business rules for a single activity. Compared with them, our method is a process-based, real-time, and dynamic data analysis method. It can predict each order in time and intercept the abnormal ones. Moreover, normal process-based trading patterns of users are recorded and regarded as one of the next rules, resulting in a more personalized matching criteria for each user. Furthermore, the reference criteria of this method will intelligently improve as users’ patterns change.

Based on the e-commerce transaction interaction data, as well as various abnormal events that may exist in reality, this paper adopts formalization and data analysis methods to conduct real-time analysis and research on abnormal behaviors in the transaction process. In this study, DPNE is used to formalize an e-commerce trading and anomaly detection system. The interaction information of multiple terminals and system servers is used to detect algorithm fusion for a single activity and analyse the combination of multiple activities to complete the capture and comprehensive analysis of outliers in the whole process.

3. Preliminary

This paper mainly focuses on the whole process from activities of buyers’ logging in to the transaction done. The system runs dynamically. By modeling and analysing the interactive data in this system, the current transaction can be monitored in real time. Therefore, this system can monitor and catch suspicious anomalies during the trading process and make a timely and comprehensive judgment on the current order at the end of the transactions and intercept abnormal orders and users.

Figure 1 shows the data interaction of an e-commerce trading system, which has a multiparty participation process [38]. It consists of several different types of resources, e.g., buyers, sellers, e-commerce trading platforms, cashier servers, and third-party servers. The data of interactions between these five parties are recorded in the data center. In this section, we will introduce the Extended Data Petri net, which is used to model e-commerce transaction processes, and some relevant works can be seen in [7, 8, 38].

3.1. Modeling Approach

Petri net is widely used as the basis for the formal modeling of systems. In reality, it is usually extended to describe more complex processes, e.g., Time Petri net [39] and logical Petri net [40]. In order to effectively simulate an e-commerce transaction process, we propose Extended Data Petri net, which is based on Petri net with data operations (PN-DO) [16] and Data Petri net (DPN) [17].

Definition 1. (PN-DO). A PN-DO is a 4-tuple net [16] whose components comprise (1) a set of control places and a set of data places; (2) a set of transitions; (3) a set of control arcs; (4) a set of read arcs; (5) a set of write arcs; (6) a set of delete arcs.

PN-DO can guarantee not only the correctness of control flows but also the correctness of data flows [16]. Thus, it is more suitable than traditional Petri nets for describing a process model with data interactions.

Definition 2. (DPN). A DPN is an 8-tuple net N = (P, T, F, V, U, R, W, G) [17], where (1) (P, T, F) is a Petri net that contains a set of places, transitions, and arcs; (2) V is a set of data variables; (3) R is a read function that labels each transition with the set of variables that need to be read, which is represented as R: T → 2^V; (4) W is a write function that labels each transition with the set of variables that need to be written, which is represented as W: T → 2^V; (5) G is a guard function that defines canonical variable rules for each transition, which is represented as G: T → G_V.

PN-DO is essentially a control-flow-oriented model. To further describe and analyse the data flows in the business process of e-commerce trading, data attributes are needed. DPN is a Petri net that introduces data attributes. It can analyse data in each activity and explain why individual cases take a particular path [17]. In order to obtain a better data analysis, DPNE is defined by combining the advantages of the above nets.

Definition 3. (DPNE). A DPNE is a 7-tuple (PN-DO, V, U, R, W, G, Φ), in which (1) PN-DO is a Petri net with data operations, i.e., the 4-tuple of Definition 1, and T denotes its set of transitions; (2) V is a set of data variables that are used in the transitions; the names and number of these variables are consistent with those defined in PN-DO; (3) U is a function that defines the range of each variable, i.e., U(v) is the domain of a variable v ∈ V, and the value of every variable must lie within the range defined for it; (4) R is a read function R: T → 2^V, which indicates the set of defined variables that need to be read for each transition; (5) W is a write function W: T → 2^V, which indicates the set of variables that need to be written for each transition; (6) G is a guard function; the guard G(t) is represented by some combination rules of read variables and written variables; (7) Φ is an algorithm or function that processes a set of read or written variables of a transition with a specified algorithmic functionality.

We describe each transition t by a quadruple that records t together with its read variables, its write variables, and its algorithmic functional parameter Φ(t). In a DPNE, we only introduce data attributes to the PN-DO model without changing the nature of its control flows. The detection of reachability and data flow errors is the same as described in [30].
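The data side of Definition 3 can be pictured with a small Python sketch. It is only an illustration under stated assumptions: the class and field names are hypothetical, the control part (PN-DO) is omitted, and U is modeled as a membership test per variable rather than as an explicit domain.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, FrozenSet, Optional

Values = Dict[str, Any]  # variable name -> value


@dataclass(frozen=True)
class TransitionData:
    """Data annotations of one transition t (hypothetical container)."""
    reads: FrozenSet[str] = frozenset()                 # R(t)
    writes: FrozenSet[str] = frozenset()                # W(t)
    guard: Optional[Callable[[Values], bool]] = None    # G(t); None means no guard
    phi: Optional[Callable[[Values], Values]] = None    # Phi(t); None means no algorithm


@dataclass
class DPNEData:
    """Data part of a DPNE: V, U and the per-transition annotations."""
    variables: FrozenSet[str]                           # V
    domain: Dict[str, Callable[[Any], bool]]            # U, as a membership test per variable
    transitions: Dict[str, TransitionData] = field(default_factory=dict)

    def add_transition(self, name: str, data: TransitionData) -> None:
        # Every variable read or written by a transition must be declared in V.
        unknown = (data.reads | data.writes) - self.variables
        if unknown:
            raise ValueError(f"{name} uses undeclared variables: {sorted(unknown)}")
        self.transitions[name] = data

    def in_domain(self, values: Values) -> bool:
        # Item (3): each value must lie within the range U defines for its variable.
        return all(self.domain[name](value) for name, value in values.items())
```

Keeping the guard and Φ optional mirrors the cases distinguished later by Algorithm 1, where a transition may have no guard or no algorithmic parameter.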

Definition 4. (Valid bindings). A quadruple describing a transition t is considered valid if it meets the following conditions: (1) t meets control enabledness [30] at a marking M, denoted by M[ct>, which means that each of the input control places of t contains tokens; (2) t meets data enabledness [30] at M, denoted by M[dt>, which means that each of the input data places of t contains tokens; (3)–(5) …; (6) the input variables of the function Φ(t) are predefined; (7) G(t) is evaluated as true with the input and output variables of t.

Some basic notations [30] are used for a transition t: the set of control places whose postset is t, the set of control places whose preset is t, the set of data places that are read by t, the set of data places that are written by t, and the set of data places that are deleted by t. Control places reflect control flow, and data places reflect data flow.

A binding that satisfies the above conditions is valid, and a transition with a valid binding can be fired. The firing rules of DPNE are as follows. There are two types of places in DPNE: control places and data places. For a transition whose preset consists of control places, the enabled transition consumes a token from each place of its preset and generates one token in each place of its postset.

For transitions whose preset consists of data places, there are three situations. If a transition has a write-set, it generates tokens in each data place of the write-set; if it has a delete-set, it consumes tokens from the data places of the delete-set; if it has a read-set, the tokens in the read-set are neither added nor removed. The variables related to these places can be read multiple times, and the corresponding arcs are represented by dotted lines.
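The firing rules for control and data places can be sketched in Python as follows. This is a minimal illustration, not the formal semantics of [30]: the function name, the place names, and the choice of one token per write or delete are assumptions, and data enabledness is taken to mean that every read or deleted data place holds a token.

```python
from collections import Counter


def fire(marking, pre_control, post_control, read_set, write_set, delete_set):
    """Return the marking reached by firing, or None if the transition is not enabled."""
    m = Counter(marking)  # missing places count as zero tokens
    control_enabled = all(m[p] >= 1 for p in pre_control)
    data_enabled = all(m[d] >= 1 for d in set(read_set) | set(delete_set))
    if not (control_enabled and data_enabled):
        return None
    for p in pre_control:    # consume one token from each input control place
        m[p] -= 1
    for p in post_control:   # produce one token in each output control place
        m[p] += 1
    for d in write_set:      # a write generates a token in the data place
        m[d] += 1
    for d in delete_set:     # a delete consumes a token from the data place
        m[d] -= 1
    # Read places are left untouched, so they can be read repeatedly.
    return +m                # unary plus drops places with zero tokens
```

For example, firing a transition that reads a hypothetical data place `d_order` and writes `d_receipt` moves the control token forward while leaving the token in `d_order` in place.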

These can be written formally [30] as follows:

Formula (1) states that a valid binding may occur at a marking and that a new marking is generated after firing the corresponding transition.

In this paper, the business processes of e-commerce transaction systems are modeled and analysed by DPNE, and the detection function of users’ static and dynamic behaviors is also integrated into this process.

3.2. The DPNE Model of E-Commerce Transaction

In this subsection, we present a case of electronic trading systems and their related outlier extraction and further propose detection algorithms to illustrate how the system modeled by DPNE can catch outliers based on real-time input data.

Figure 2 shows the DPNE model of an e-commerce transaction process. This model describes the e-commerce transaction process from the time when a buyer logs in to the client to the end of the transaction. Table 1 defines the meaning of each variable attribute. Table 2 describes the data interactions of the Petri net in Figure 2. During the whole process, the model collects and processes data in real time. On the e-commerce trading platform, multiple buyers and sellers interact concurrently, and several servers are accessed simultaneously. Because we only analyse the working principle of e-commerce systems, we adopt a single-user mode.

Figure 2 concentrates on an interactive process of real-time data inputs and outputs for a buyer from the start of the logging client to the end of the order. It is modeled by DPNE. The transaction platform requires the buyer to have sufficient personal information. The buyer’s browsing, viewing, adding to the shopping cart, and other operations are recorded in the client terminal. The current operational behavior is analysed, and a warning is sent to the users who deviate significantly from the normal dynamic mode. After the order is submitted, the cashier server establishes a unique transaction certificate for it. The payment password is entered a maximum of 2 times. Otherwise, the order will be automatically terminated. After the payment is done, the information of successful payment is sent to the third-party cashier and seller. Then, the third-party cashier sends the information to the buyer. After that, the status of the order is changed to paid, and then, the submitted order and paid order are rechecked. Meanwhile, the static behavior data is matched for the current real-time static behavior. At the same time, the full sequence comparison algorithm [19] is used to process the current real-time static behavior so as to get the matching result. Finally, the overall data interaction is summarized and analysed to determine whether the current transaction is abnormal or not.

Control flows focus on the partial orders of tasks, while data flows relate to data operations. Our model can reach terminated states (transactions completed or terminated for some reason). Data flow errors are defined in [30], including missing data and inconsistent data. Data flow errors can be associated with exceptions in the process. Through the verification of the structural properties, the rationality and correctness of business process are guaranteed. They provide a theoretical basis for the follow-up work.

4. Outlier Extraction and Detection Algorithm

In this section, an algorithm based on the DPNE model is proposed on the basis of the conformance checking algorithm [41]. It is used to judge whether the current transaction is abnormal. The DPNE model is simulated by the proposed algorithm and [20] is used to analyse the running state of the whole process.

4.1. User Behavior Matching

The behavior data of users recorded by the operations of the client terminals mainly contains information about the users’ behavior habits. These data contain not only static attribute data (e.g., user ID and IP address) but also behavior data of dynamic operations. To make full use of this information, and given the characteristics of abnormal behaviors, it is necessary to separate these two types of data when mining user behaviors. This is beneficial to the discovery of more subtle anomalies.

Pattern matching is an important research field for judging abnormal behaviors of users. A normal behavior pattern can be obtained by pattern mining algorithms, and then, the current behavior pattern is compared with the pattern in a normal behavior pattern library so as to achieve the purpose of detection [19]. By integrating the pattern matching algorithm into the conformance checking method of process mining, its functions are extended, and they are suitable for more application scenarios.

The user’s behavior habits are hidden in the operation data, and each user’s operation habits are different. Therefore, these operational data are highly individual. When others use the same devices, accounts, and IP addresses as the real users to conduct e-commerce operations, the behavioral patterns mined from the user’s historical behavior data can be used for pattern matching. Thus, the differences between the current behaviors and the user’s habits can be found.

As shown in Table 3, the user behavior data used in this method mainly falls into the following categories. Since it has a limited number of categories, the related user behavior can be marked and classified as integers.
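Marking behavior categories as integers can be illustrated with a short Python sketch. The category names below are hypothetical stand-ins for the entries of Table 3 (they echo the browsing, viewing, and add-to-cart operations mentioned earlier), and reserving the code 0 for unknown actions is an assumption of this sketch.

```python
def build_codebook(categories):
    """Assign each behavior category a positive integer code, in the given order."""
    return {name: code for code, name in enumerate(categories, start=1)}


def encode(actions, codebook):
    """Translate a sequence of actions into integer codes; unknown actions map to 0."""
    return [codebook.get(action, 0) for action in actions]


# Hypothetical categories; the actual ones are those listed in Table 3.
codebook = build_codebook(["browse", "view", "add_to_cart"])
session = encode(["browse", "add_to_cart", "view", "refund"], codebook)
```

The resulting integer sequences are the kind of input that the pattern comparison steps of the following subsection operate on.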

4.2. User Behavior Analysis

Our user data is divided into static attribute data and dynamic behavior data. Thus, the detection of static attribute anomalies and the detection of dynamic behavior anomalies are carried out separately. After that, the existence of anomalies is judged.

Figure 3 shows the specific steps of anomaly detection for static attributes and dynamic behaviors of users [42], i.e.,
(1) Step 1: obtain normal historical data of users from user databases, preprocess the data, and transform it into mathematical symbols;
(2) Step 2: use data mining algorithms to obtain the user’s normal data pattern library from the normal historical data and take it as the standard of normal states;
(3) Step 3: make a pattern comparison between the current user data and the user’s normal data pattern library. If it is close to the normal pattern, the current user state is considered normal; otherwise, it is marked as an outlier.

With respect to the user static attributes, this paper chooses a full sequence comparison method [19] to compare the user’s current static attributes with the static attribute pattern. As for the dynamic attributes, a recursive correlation algorithm [19] is used to compare patterns. Our method fully considers the factors of subsequences and calculates the similarity of two sequences more accurately.
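The comparison step can be sketched in Python as follows. The full sequence comparison and recursive correlation algorithms of [19] are not reproduced here; as a clearly labeled stand-in, the sketch uses `difflib.SequenceMatcher` from the standard library as the similarity measure, and the threshold value is an arbitrary assumption.

```python
from difflib import SequenceMatcher


def similarity(current, pattern):
    """Stand-in similarity in [0, 1]; not the comparison algorithm of [19]."""
    return SequenceMatcher(None, current, pattern).ratio()


def is_outlier(current, pattern_library, threshold=0.8):
    """Step 3: flag the current sequence if it is not close to any normal pattern."""
    best = max((similarity(current, p) for p in pattern_library), default=0.0)
    return best < threshold


# Hypothetical library of integer-coded normal patterns (Step 2 output).
library = [[1, 2, 3, 4], [1, 3, 4]]
```

With this library, a sequence identical to a stored pattern is considered normal, while a sequence sharing no element with any pattern is marked as an outlier. An empty library marks everything as an outlier, which is a design choice of the sketch rather than something stated in the text.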

4.3. An Anomaly Detection Algorithm Based on the DPNE Model

Although conformance checking and abnormal detection are different concepts, they define outliers in a similar way. Conformance checking has the function of capturing abnormal data, which brings a new perspective to the traditional methods of abnormal detection. The identification of nonconforming traces is clearly valuable. Abnormal or deviant behaviors occur from time to time in the process of e-commerce transactions, e.g., the login IP changes when the buyer travels to buy goods, or the buyer uses someone else’s account to pay. Moreover, the same deviation may lead to different judgment results in different scenarios. For example, a change in the transaction amount may be a fraud attack, but it may also be the result of bargaining between a buyer and a seller, after which the final payment amount is changed by the seller. Therefore, not all deviations are fraudulent.

Data-aware conformance checking is similar to the traditional consistency alignment operation in process mining [17]. Conformance checking matches each trace in the event log to the process model DPN. The difference is that our purpose is not to achieve the optimal match with the minimum cost sum so as to preserve the event log and model information as much as possible. In this paper, our purpose is to use DPNE for anomaly detection of e-commerce transactions. Any activity that does not conform to the model is considered an outlier, and a transaction with outliers is likely to be a fraudulent transaction. Moreover, to achieve a more complete anomaly detection function, it is essential to integrate the relevant similarity matching algorithm into this method.

Based on the DPNE model in the whole analysis process, we need to establish two algorithms. Algorithm 1 is proposed to analyse the matching of the data of each active data flow and its guards. Then, the outliers are extracted to obtain the current data flow pattern sequences. Algorithm 2 is to make a comprehensive analysis of the current transaction data flow sequence so as to judge whether the current transaction is normal or not. These two algorithms are described in detail below.

Algorithm 1: Outlier extraction
Input: DPNE = (PN-DO, V, U, R, W, G, Φ), initial marking M0; Tx refers to the transition in Table 2, where x is the serial number, x ∈ [0, 31];
Output: event log L; the outlier sequence of L is stored in B;
1. L = ∅, B = ∅;
2. Let M0 be the root node, and mark it as “control-enabledness and data-enabledness”;
3. While “control-enabledness and data-enabledness” nodes exist Do
  Choose a “control-enabledness and data-enabledness” node as M';
  3.1 If t ∈ T, M'[ct> and M'[dt> Then
    3.1.1 If t = T6 or t = T30 Then
      Add t to L and mark t as 1 in B;
      Go to 3;
    3.1.2 Else
      Select the transition t and determine whether it meets the guards;
      If G(t) ≠ ∅ Then
        If Φ(t) = ∅ Then
          If t meets G(t) Then add t to L and mark t as 0 in B;
          Else add t to L and mark t as 1 in B;
          End
          Go to 3;
        Else
          Get v(r) and v(w) of t as input and process them with Φ(t), the predetermined algorithm of the current t;
          If t meets G(t) Then add t to L and mark t as 0 in B;
          Else add t to L and mark t as 1 in B;
          End
          Go to 3;
        End
      Else
        Add t to L and mark t as 0 in B;
        Go to 3;
  3.2 Else
    T = T - L; L = ∅; B = ∅;
    Go to 2;

Algorithm 2: Abnormal order detection
Input:
1. The outlier sequence set B, which is composed of the transition set in Algorithm 1;
2. The set of the user’s historical transaction behavior patterns and the normal patterns, together with their transition sets;
3. The weight sequence of the activities;
4. The weight sequence of the normal patterns;
5. N, the set of all normal behavior patterns;
6. The set of matching results of each activity;
Output: the path to be selected by the system.
1. Calculate the matching results between the outlier sequence and the normal patterns and store them in the set of matching results;
2. While the elements of the set N are not all traversed Do
   select an element from N;
   While the elements of the selected set are not all traversed Do
     select an element from it;
3. Calculate each matching result under the current normal behavior patterns by formula (2);
4. Calculate the final anomaly detection result by formula (3);
  4.1 If … Then …
  4.2 Else …

The first algorithm performs full sequence matching [19] for a single activity and extracts the matching sequence made up of the outliers that correspond to the activities of the control flow. Specifically, Step 3.1 first determines whether the current activity can occur. If it can, the algorithm determines whether the current activity is one of the specific activities, i.e., T6 or T30. Our method defines these activities as outliers whenever they happen. As can be seen from Step 3.1.1 of Algorithm 1, we carry out special processing on T6 and T30: if the dynamic behavior of the user is abnormal or there exist other unpaid orders besides the current one, these activities are marked as outliers and set to 1 in the sequence of abnormalities. In other words, as long as any of these activities occurs, it is regarded as a suspicious outlier without further judgment. If the current activity is not one of the specific activities, the algorithm goes to Step 3.1.2. First, it judges whether the activity has corresponding guards. If it has guards, the variables of the activity are processed with the parameter Φ. For example, when the current activity is operation detection, Φ is the “recursive correlation algorithm,” and the read variables of this activity are processed by Φ; in the same way, for the activity that matches static attributes, Φ is the “full sequence comparison algorithm,” and the read variables of that activity are processed by Φ. Then, the guard function G of the activity is used to judge the values of the variables after processing. If they are normal, the activity is marked as 0; if they are abnormal, it is marked as 1. For an activity with no parameter Φ, G is directly used to judge its input variables. If the activity has no guard G, it is uniformly marked as 0.

Finally, once all the activities have been judged, the sequence of anomaly detection results corresponding to the current activity is obtained and recorded in dictionary . In addition, this paper takes e-commerce user behavior anomaly detection as its background. When our method is applied to other scenarios, the parameter can be changed to suit the scenario.

Theorem 5. Algorithm 1 terminates.

Proof. By the structural analysis of the model in the previous section, under any marking, the input of Algorithm 1 can reach the final state space. Algorithm 1 starts from the starting place of the model and selects the enabled transition with . The algorithm judges each transition in the model through the loop in Step 3. The event log is formed by , which stores the transition path. If there is no enabled transition, the current activity cannot proceed; the path is then set to empty, and the algorithm goes back to Step 2 and selects another enabled transition . The set of enabled markings is therefore updated repeatedly until there are no enabled transitions. Since this set is finite, Algorithm 1 terminates.

Algorithm 1 judges the current behavior of each activity of the e-commerce transaction model from the perspective of both control flow and data flow. It yields the path set of the current transaction and the sequence of abnormality markers corresponding to each activity. In this way, Algorithm 1 analyses and judges the outliers in the whole trading process, and its result is the input to Algorithm 2. The time complexity of Algorithm 1 is .
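
The termination argument can be illustrated with a small token-game replay. The sketch below is not the paper's Algorithm 1: the toy net, the place and transition names, and the simplifying assumption that each transition fires at most once (an acyclic process) are all hypothetical. Under that assumption the loop runs at most as many times as there are transitions, which mirrors the finiteness argument in the proof.

```python
from typing import Dict, List, Tuple

# Hypothetical encoding: transition -> (input places, output places).
Transition = Tuple[List[str], List[str]]


def replay(net: Dict[str, Transition], marking: Dict[str, int]) -> List[str]:
    """Fire enabled transitions until none is enabled; return the fired path."""
    marking = dict(marking)          # work on a copy of the initial marking
    path: List[str] = []
    fired = set()                    # assumption: each transition fires once
    while True:
        enabled = [
            t for t, (inputs, _) in net.items()
            if t not in fired and all(marking.get(p, 0) > 0 for p in inputs)
        ]
        if not enabled:              # no enabled transition: stop
            return path
        t = enabled[0]
        inputs, outputs = net[t]
        for p in inputs:             # consume one token per input place
            marking[p] -= 1
        for p in outputs:            # produce one token per output place
            marking[p] = marking.get(p, 0) + 1
        fired.add(t)
        path.append(t)


# Made-up three-step process, for illustration only.
net = {
    "login": (["start"], ["logged_in"]),
    "order": (["logged_in"], ["ordered"]),
    "pay": (["ordered"], ["end"]),
}
path = replay(net, {"start": 1})
```

Because `fired` grows by one on every iteration and is bounded by the number of transitions, the loop cannot run forever, regardless of the initial marking.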

Algorithm 2 processes the data extracted by Algorithm 1. It consists of three main steps. First, the judgment result sequence of the current activity is matched against the standard sequence and the sequences of the user's history, which personalizes the judgment from the perspective of user habit. Second, different parameters are assigned according to the severity of different activities, and a weighted average of the matching results of the first step is computed to obtain a more comprehensive analysis; formula (2) is used to compute the matching score between the current sequence and each of the normal sequences. Third, the matching scores obtained in the previous step are combined by formula (3) to obtain the final result.

By analyzing the matching results of the current transaction pattern against several normal patterns, the final result is obtained and used as the criterion for selecting the process path between and .
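
The three steps can be sketched as below. Formulas (2) and (3) are not reproduced here; the scoring used in the sketch (a severity-weighted mismatch ratio per pattern, then a weighted sum over patterns) is an assumption made only for illustration, and the function names and toy sequences are hypothetical. The pattern weights of 40%, 20%, 20%, and 20% and the threshold of 0.4 are the values used in the case study.

```python
from typing import Sequence


def weighted_mismatch(
    current: Sequence[int],
    pattern: Sequence[int],
    activity_weights: Sequence[float],
) -> float:
    """Assumed per-pattern score: share of severity weight that disagrees."""
    total = sum(activity_weights)
    if total == 0:
        return 0.0
    diff = sum(
        w for c, p, w in zip(current, pattern, activity_weights) if c != p
    )
    return diff / total


def anomaly_score(
    current: Sequence[int],
    patterns: Sequence[Sequence[int]],
    activity_weights: Sequence[float],
    pattern_weights: Sequence[float],
) -> float:
    """Assumed final score: weighted sum of the per-pattern scores."""
    return sum(
        pw * weighted_mismatch(current, pat, activity_weights)
        for pat, pw in zip(patterns, pattern_weights)
    )


def is_abnormal(score: float, threshold: float = 0.4) -> bool:
    """Classify the transaction by comparing the score with the threshold."""
    return score > threshold


# Toy data (made up): one expert pattern and three history patterns.
activity_weights = [0, 2, 6, 2]
patterns = [[0, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 1]]
pattern_weights = [0.4, 0.2, 0.2, 0.2]
current = [0, 0, 1, 0]

score = anomaly_score(current, patterns, activity_weights, pattern_weights)
abnormal = is_abnormal(score)
```

With these toy numbers the score is about 0.68, so the transaction would be flagged. Because the scoring function is an assumption, the sketch is not expected to reproduce the probabilities reported in the case study.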

Theorem 6. Algorithm 2 terminates.

Proof. The number of elements, i.e., transitions, in the path set  is finite, so they can all be traversed and the inner loop ends. The number of elements, i.e., normal behavior patterns, in set is also finite, so they can all be traversed and the outer loop ends. Since both nested loops end, Algorithm 2 terminates. The time complexity of Algorithm 2 is .

5. Case Study

The previous section introduced the relevant data processing algorithms. In this section, we use a set of simulated event logs and data sets to replay and match against the above model using Algorithm 1, which yields a series of outliers. Algorithm 2 is then used to determine whether the current transaction is normal or abnormal.

We analyse and judge the process data of successful transactions. Failed transactions do not establish order information, so their process and data are not analysed. For example, we now have a sequence of activities as follows:

In the standard schema, the normal pattern defined by the expert is a sequence of 0s. Buyers also have specific habits in the consumption process. Even if these habits do not strictly follow the rules defined by experts, they still have reference value and can reduce the misjudgment of users. The set of assumed normal patterns and patterns that represent the user’s current normal habits is shown below:

Assume that the weight sequence of the activities defined by the expert is (0, 0, 2, 0, 2, 0, 0, 6, 0, 2, 0, 5, 5, 2, 0, 2, 1, 0, 0, 3, 3, 3, 3, 3, 3, 4, 3, 0), and that the weights of the normal pattern sequences are 40%, 20%, 20%, and 20%, respectively. Consider first the case in which all the current behaviors are normal, i.e., the above sequence of activities is the same as the corresponding values defined by the expert. Algorithm 1 gives the matching result between the current activity and the preset activity. Applying Algorithm 2 with the weights given above, we get an anomaly probability of about 0.115. In the same way, if all the current behaviors are abnormal, the anomaly probability is about 0.885.

We process the following bindings with Algorithms 1 and 2 in turn to obtain the results of anomaly analysis after considering the three most recent patterns in the user’s history.

As shown in the above cases, the order is normal when viewed only from the perspective of control flows (including and ). After the data attributes are considered, some of the activities are abnormal. The case is then analysed, and the anomaly probability is found to be about 0.415. Because it is greater than the defined threshold of 0.4, the activity will be chosen. That is, the order is classified as abnormal and the background system will take steps to block the transaction. Thus, our e-commerce transaction model and abnormal order detection algorithm can judge the event log sequence and database, monitor the transaction process in a timely manner, and judge and intercept the current transaction in real time.

6. Conclusions

The healthy development of e-commerce requires a secure transaction environment. Against this background, this paper proposes an outlier analysis method based on a formal model and process system. A successful process system relies on effective modeling and analysis, in which both control flows and data flows should be considered [27]. To further expand data detection and analysis capabilities, this paper proposes DPNE, which is based on PN-DO and DPN. It introduces algorithms and functions into activities so that the conformance detection function can be extended. Based on the e-commerce model expressed as a DPNE, which includes the mobile transaction process, algorithms for outlier extraction and abnormal order detection are proposed. With these algorithms, analysis of the current transaction data can be used to determine whether the current transaction and the user are abnormal. This establishes a data analysis technique that can analyse anomalies in e-commerce transaction flows from the perspective of both process and data.

In conclusion, this paper constructs and analyses the transaction model and realizes some basic functions of outlier capture. The methods in this paper can also be applied to other areas, such as social networks [43]. In future work, we will study a wider range of anomalous transaction types and develop more efficient algorithms for detecting anomalies in e-commerce transactions.

Data Availability

The simulated data used to support the findings of this study are available from the first author upon request.

Conflicts of Interest

The authors declare that there is no conflict of interest regarding the publication of this paper.


Acknowledgments

This work was supported in part by the Natural Science Foundation of Shaanxi Province under Grant 2021JM-205 and by the Fundamental Research Funds for the Central Universities of China under Grant GK202003080.