Web Services Choreography Description Language lacks a formal system to accurately express the semantics of service behaviors and verify the correctness of a service choreography model. The paper presents a new approach of choreography model verification based on Description Logic. A metamodel of service choreography is built to provide a conceptual framework to capture the formal syntax and semantics of service choreography. Based on the framework, a set of rules and constraints are defined in Description Logic for choreography model verification. To automate model verification, the UML-based service choreography model will be transformed, by the given algorithms, into the DL-based ontology, and thus the model properties can be verified by reasoning through the ontology with the help of a popular DL reasoner. A case study is given to demonstrate applicability of the method. Furthermore, the work will be compared with other related researches.

1. Introduction

Web service technology has been popularly applied due to its power of interoperation, which allows various applications to run on heterogeneous platforms. Standards for web service composition cover two different levels of view: choreography and orchestration [1, 2]. The choreography view describes the interactions between services from a global perspective, while the orchestration view focuses on the interactions between one party and others. The web service choreography description language (WS-CDL) [3] is an XML-based language for the description of peer-to-peer collaborations of participants from a global viewpoint. However, WS-CDL is a declarative language and the specified concepts are weakly constrained. It lacks a formal system to accurately express the semantics of service behaviors and verify the correctness of a service choreography model. As a result, the built models may suffer from the problem of inconsistency, conflict, and realizability.

A number of approaches are suggested for formally modeling and verifying web services composition. Xiao et al. [4] proposed a process algebra called probabilistic priced process algebra (PPPA) for modeling and analyzing web service composition from both functionality and nonfunctionality, such as reliability and performance. Moreover, they provided a united method based on PPPA to model and analyze both functionality and QoS of web service composition. Cambronero et al. [5] presented an approach to validation and verification of web services choreographies and more specifically for composite web services systems with timing restrictions. They defined operational semantics for a relevant subset of WS-CDL and then provided a translation of the considered subset into a network of timed automata for the validation and verification using the UPPAAL tool. Zhou et al. [6] put forward an approach to testing WS-CDL programs automatically. The dynamic symbolic execution technique was used to generate test inputs, and assertions are treated as the test oracles. An engine that can simulate WS-CDL is used to execute the WS-CDL programs during symbolic execution. Besson et al. [7] adapted the automated testing techniques used by the Agile Software Development community to the SOA context to enable test-driven development of choreographies. They present the first step in that direction, a software prototype composed of ad hoc automated test case scripts for testing a web service choreography. Gu et al. [8, 9] originally advocated for a formal modeling framework, called Abstract WS-CDL. This includes grammar, congruence relations, and operational semantics. They defined a set of mappings of the Abstract WS-CDL global model to the Pi calculus-based local model and subsequently suggested a set of deductive reasoning rules of state reachability and terminability.

One of the crucial questions in choreography-based development is to check the realizability and conformance properties. Hallé and Bultan [10] proposed a novel algorithm for deciding realizability by computing a finite state model that keeps track of the information about the global state of a conversation protocol that each peer can deduce from the messages it sends and receives. McNeile [11] provided a new technique that uses compositions of partial descriptions to define a choreography, and he demonstrated that realizability of a choreography defined as a composition only needs to be established individually for the components of the composition. Basu et al. [12] gave necessary and sufficient conditions for realizability of choreographies and implemented the proposed realizability check on three granularities: (1) web service choreographies, (2) singularity OS channel contracts, and (3) UML collaboration (communication) diagrams. Yeung [13] put forward a formal approach to web service composition and conformance verification based on WS-CDL and WS-BPEL. The main contributions included a precise notion of choreography conformance upon which verification is based and support for the complementary use of visual modeling (e.g., UML) and standard WS notations in composition.

We argue that checking the realizability and conformance properties should be approached in two ways. One is to check the completeness and consistency of a choreography specification to guarantee there are no logical conflicts in the specification. The other is to check whether the specified choreography behaves in a correct way or whether the system functions well to complete its jobs. Having investigated the recent research results, we found that most of them addressed only one side rather than both.

To solve the problem above, we propose an approach of choreography model verification based on Description Logic, or CMV-DL. A service choreography modeling framework is provided to support UML-based modeling. The metamodel of service choreography extends the WS-CDL specification. Two algorithms are given to transform the UML-based service choreography model into the DL-based ontology through which the verification can be made using a DL reasoner.

To enable automatic verification, we introduce Description Logic (DL) [14] to formalize the choreography model and define a set of rules and constraints for verification. DL is a knowledge representation language and is a decidable subset of first-order predicate logic. It contains a set of basic concept constructors, such as concept consumption () and universal constraint (). As a subset of DL, SHOIN(D) [15] is the logical basis of web ontology language (OWL) [16]. It is powerful in knowledge description ability, reasoning decidability, and knowledge reusability and more importantly there are available supporting reasoners such Pellet [17] and Racer.

The rest of the paper is organized as follows. Section 2 introduces the service choreography modeling framework, highlighting the metamodel of service choreography. Section 3 discusses CMV-DL further on model verification mechanism and UML-DL conversion and introduces the deductive reasoning rules specified in Semantic Web Rule Language (SWRL) [18]. Section 4 provides a case study to show the potential usage of CMV-DL. Section 5 investigates the related work and draws a comparison between CMV-DL and other choreography verification methods. The final section brings a conclusion and foresees our future work.

2. Service Choreography Modeling Framework

According to the OMG’s four-layered metamodel architecture, UML can be extended by defining new stereotypes at metamodel level (M2) [19] and thus becomes a domain-specific language. The modeling framework for service choreography is accordingly defined in two levels, as shown in Figure 1. The metamodel of service choreography is built by extending the concepts of WS-CDL to provide the syntax and semantics for visually presenting a choreography and define a set of domain rules for model checking. The application model specifies the concepts of service choreography of an application (or a system) by instantiating the concepts of the metamodel.

2.1. Metamodel of Service Choreography

To build the metamodel of service choreography, we define the main concepts of service choreography by referring to the WS-CDL specification.

Definition 1 (session ()). A session is composed of a set of basic interactive activities which are performed by one or more participants in order to complete a specific function. Sessions can be denoted as a four-tuple structure: , where is the name, refers to the activities that realize this specific function, is a set of the roles that participate in the activities, and is a set of the preconditions for carrying out this session.

Definition 2 (choreography ()). A choreography defines the collaboration contracts between the participants and the interoperations of cross-system behaviors. It can be denoted as a three-tuple structure: , where is the name, is a set of the roles that participate in the interoperations, and represents a set of sessions to be executed in the interoperation processes.

Definition 3 (metamodel). The metamodel of service choreography defines the syntax and semantics for visually presenting a choreography and provides a set of domain rules for checking the correctness properties of a choreography model. It is composed of three parts: MetaConcept, MetaRelation, DomainRule, where MetaConcept and MetaRelation are sets of metaconcepts and metarelations which are defined by inheriting the counterpart concepts from the WS-CDL specification. The core elements of the metamodel are shown in Figure 2.

2.1.1. MetaConcept

MetaConcept is a finite set of metaconcepts that originate from WS-CDL but are not limited to it. New concepts, such as Session, AtomicSession, and CompoundSession, are extended for the purpose of formal modeling and verification. Table 1 provides detailed definitions and descriptions of the metaconcepts.

The concept activity can be divided into two kinds: BasicActivity () and StructuralActivity ():

is defined as

is used to describe a compound activity in following syntax and semantics:

The structure of a compound activity appears either in workunit pattern or in control-flow pattern. The workunit structure is defined in WS-CDL by three substructures. The condition structure is expressed as . The repeat structure is expressed as . A workunit structure is expressed as , which means that the activity will be blocked until the precondition is evaluated to be “true”; that is, the activity is triggered by the precondition. If terminates successfully and the repetition condition is “true,” the workunit will repeat; otherwise, it will finish. A control-flow structure is defined as any combination of sequentially executed activities , parallel executed activities , nondeterministically executed activities , or selectively executed activities .

The concept session can also be divided into two kinds: AtomicSession () and CompoundSession ():

An atomic session may appear in any form of the following:

describes that the role does not perform any operation; means that performs an internal silent action ; describes an assignment operation that the value is assigned to the variable ; describes a request interaction from to through , where the request message is sent from to ; describes a response interaction; describes request-response interaction; describes invoking operation between sessions; and NULL denotes the termination state of a session.

Two or more atomic sessions may be combined with one another by structural connectors to become a compound session . The combination follows the following syntax and semantics:

describes two parallel executed sessions; describes two sequentially executed sessions where is the predecessor and is the successor; describes two selectively executed sessions that either or will be executed depending on whether the precondition or is met; describes a repeatedly executed session, where is triggered by the precondition and it will be repeatedly executed until becomes “false.”

2.1.2. MetaRelation

MetaRelation is a finite set of semantic associations between the metaconcepts appearing in Table 1. It inherits the concepts of semantic verbs, such as Support, Perform, and Depend, from WS-CDL and is supplemented by some new relations such as SessionDeduction, Congruence, and Sequence, to enable comprehensive model checking.

Table 2 provides a set of core metarelations accompanied with their formal semantics specified in DL. The universal constraint is a DL symbol interpreted by , where is a set of metaconcepts, is a set of metarelations, and denotes nonempty set of discourse domains.

Definition 4 (SessionDeduction). SessionDeduction is defined by a labeled transition system: , meaning that the session will become in the future when the precondition becomes true and the atomic session of has been executed. SessionDeduction is defined for state reachability reasoning.

Definition 5 (congruence). If the two sessions and behave exactly in same way, there is a congruence relation between them, marked with the symbol . The conditions of congruence are listed as follows:

Definitions of MetaConcept and MetaRelation provide the syntax and semantics for modeling service choreography. In the subsequent sections, we will discuss the operational semantics for session evolution and the deductive domain rules for checking the properties of consistency, completeness, and state reachability of a service choreography model.

2.1.3. DomainRule

Domain rules are description of constraints for specific domains [20]. DomainRule is a set of rules defined upon the metamodel, providing overall constraints that need to be held by all concepts and relations in the service choreography model.

Domain rules of service choreography can be classified into three categories: consistency, completeness, and deductive reasoning. They are not limited to what we give in the following. They may be continuously enriched and improved as applied.

Definition 6 (consistency). A service choreography model is consistent provided that (1) the application model is built consistent with the metamodel and (2) there is not any conflict among the concepts of the application model. A typical set of consistency rules are defined below.(i): if there is a metarelation that associates the metaconcept with () in the metamodel and is instance of that associates the application concept with in the application model, then must be an instance of and must be an instance of .(ii): no more than one relation of Sequence, Parallel, Choice, or WorkUnit is allowed to associate a pair of sessions.(iii): if two sessions are associated with the relations and at the same time, the atomic session associated with by the Executing relation must be in type of , , or .(iv): if two sessions are associated with the relation (), must be an atomic session, that is, .(v): if two associated application concepts and are declared in the application model, there must be a metaconcept and a metarelation that satisfy and .(vi): the two relations and , are allowed to exist at the same time.

The WS-CDL specification includes many clauses stated with the keywords such as MUST/MUSTNOT, SHOULD/SHOULDNOT, and SHALL/SHALLNOT. Those clauses may relate to the consistency and completeness constraints. For example, “a Choreography MUST contain one or more sessions,” from which we can formally define a rule in DL. The DL expression , used for constraining a relation, is interpreted by (≥, where is a metarelation and is a metaconcept, and denotes nonempty set of discourse domains.

Definition 7 (completeness). A service choreography model is complete provided that the number of the relations in the application model satisfies the multiplicity constraint of the corresponding metarelation in the metamodel. A typical set of completeness rules are defined below, according to the WS-CDL specification:(i)(ii).(iii)(iv).(v)(vi)(vii)

Deductive reasoning rules relate to operational semantics that describes the evolution of a session. They are used to verify the state reachability of service choreography. Every deductive reasoning rule is a Horn clause in the following form: IF THEN , where the antecedent is a conjunction of one or more clauses and the consequent is an assertion of facts.

The relation SessionDeduction defines deduction of session evolution. If the precondition holds and the session executes an atomic session making become the session , then is deduced to . In particular, the two sessions can be identified by two states, before and after evolution.

Choreography model checking may concern two kinds of state. One is the session state that defines whether a session can normally end and successfully complete its job. The other is the application state that defines whether an application that is running on one or more sessions can normally end. As a session will never end until all atomic sessions end, the state space of a session is determined by all possible states of the atomic sessions. Similarly, the state space of an application is determined by all possible states of the sessions.

Definition 8 (state reachability). A state (session state) is reachable provided that there is an evolving session which can be identified by the state and which can be deduced from its ancestor in its initial state.

Inference 1. If the termination states of an application described by a choreography model are reachable, that is, the termination sessions (denoted as NULL) can be deduced from the initial sessions, then all states of the application are reachable and the choreography models are thereby proven correct.

The basic deductive reasoning rules are listed as follows:










is the rule that holds the indivisibility of an atomic session. is the rule that keeps the congruence in session deduction. and are the deduction rules for the sequence sessions. is the deduction rule for parallel session that two sessions are parallel executed and one of them evolves. is the deduction rule for the choice session that if the precondition is true and the atomic session is executed, one of the two sessions will evolve. is the deduction rule for the workunit that if the precondition is false and works in nonblocking mode, the workunit will be skipped. is the deduction rule for the workunit that if is “true” and is “false will become and the workunit will become after execution of . is the deduction rule for the workunit that if both and are “true will become and the workunit will become after execution of , where means if is “true” then will be iteratively executed.

2.2. Application Model of Service Choreography

The metamodel gives a formal definition of the metaconcepts and relations of service choreography and provides fundamental semantics for service choreography description. An application model, an instantiation of the metaconcept model, gives a UML-compliant description and representation of service choreography. To build the model, software engineers would start with analysis of the objectives to be achieved in the choreography and then describe the roles that participants implement, the activities that the roles perform, and the session execution patterns.

Definition 9 (application model of service choreography). The application model of service choreography formally describes the design of service choreography for a distributed application in a UML-compliant representation within the constraint of the metamodel. It comprises three parts: AppConcept, AppRelation, AppFunction.

AppConcept is a finite set of application concepts which are defined by instantiating the metaconcepts of the metamodel. AppRelation is a finite set of application relations which are defined by instantiating the metarelations of the metamodel. AppFunction is a set of functions that map AppConcept to MetaConcept or AppRelation to MetaRelation in order to trace the types of application concepts and relations in the metamodel. For example, Buyer is an AppConcept and Role is the corresponding MetaConcept, and thus the function is built as follows: AppFunction(Buyer) = Role.

The application model, if built with a universal UML tool, may suffer from the problems of inconsistency, incompleteness, and unreachable states, as mentioned in Section 2.1.3. The next section will discuss the technique of transforming the UML-based model into DL ontology to verify the correctness of the model with the help of a popular reasoner.

3. Model Transformation and Verification

UML is a semiformal specification language and it does not by itself support logic inference for model checking. A popular solution is to use OCL (Object Constraint Language), a subset of UML for defining domain constraints in first-order predicate logic for model checking. Unfortunately, it is well known that the full expressiveness of OCL may lead to undecidability of reasoning [21]. As a result, the existing methods have to either limit the UML/OCL constructs or decrease the level of automation or balance between the two.

To solve the problem, we suggest using SHOIN(D), the subsystem of DL, to formalize the models. DL is proven powerful in expressibility and decidability for knowledge engineering and allows for making use of some handy reasoning engines, such as Pellet and Racer. But engineers may worry about the fact that it is hard to learn a formal language, hoping that the formal language would be hidden by a software tool.

This section will discuss the algorithms for model transformation, the mechanism of model verification based on SHOIN(D), and the way of implementing prototype.

3.1. Mechanism for Service Choreography Model Verification Based on SHOIN(D)

In the early stage of our research, we obtained some meaningful achievements in DL-based reasoning. Dong et al. [22, 23] checked the C4ISR (Command, Control, Communication, Computer, Intelligence, Surveillance, and Reconnaissance) domain models to guarantee the consistency and completeness through converting the UML models into the Description Logic ontology and making use of inference engine Pellet. He et al. [24] presented a method of UML behavioral model verification based on Description Logic system. He et al. transformed UML behavioral models to OWL DL ontology, and hence model consistency can be verified with DL supporting reasoner. Zhang [25] tried to transform the service-oriented application models to OWL DL ontology. Based on the above research, we present, as shown in Figure 3, a mechanism for service choreography model verification based on SHOIN(D).

The principle of conversion and verification is as follows: (1) convert the metaconcepts of service choreography model and domain rules into the axiom sets in SHOIN(D) Tbox and the application concept model into the assertion sets in SHOIN(D) Abox; (2) verify the consistency, completeness, and state reachability with the help of a reasoning engine like Pellet that supports logical reasoning through the SHOIN(D) ontology. The automatic conversion can be realized by Algorithms 1 and 2.

Input: the meta model and the application model
Output: Tbox, Abox
(1)  Tbox = , Abox = ;
(2)for all  MetaConcepts   in meta concept model, do
(3)     Tbox = ;
(4)   if   is the superclass of , then
   Tbox = ;
(5)   else if   has a data attribute (data type is ), then
   Tbox = ;
(6)   else if   and Tbox
       and Tbox, then
   Tbox = ;
(7)   end if;
(8)end for;
(9) for all  MetaRelation   in meta concept model, the relation is the inverse relation of , do
   Tbox = ;
(10)   end for;
(11)    for all the multiplicity constraints in range and domain of every relation in meta concept model,
   the relation is the inverse relation of , do
(12) if multiplicity constraints is “then
   Tbox = ;
(13) else if multiplicity constraints is “then
   Tbox = ;
(14) else if multiplicity constraints is “1” then
   Tbox = ;
(15) end if;
(16)   end for;
(17) return Tbox;
(18)   for all individuals in application concept model,
(19)  if  AppFunction(o) = c, do
   Abox = ;
(20)  end for;
(21)   for all individuals , in application concept model,
(22)     if  AppFunction(r) = R, do
   Abox = ;
(23)    end for;
(24) return Abox;
Input: Deductive reasoning rules for service choreography model
Output: Deductive reasoning rules based on SWRL,
Antecedent = , Consequent = , Clause = ;
for all concepts and relations in prerequisite and conclusion,
if  ,   has relation, , are individuals of , , then
  Clause = ;
else if   has a attribute (type is ), then
   Clause = ;
end for;
for all clauses cl do
if relation between cl is “and” in prerequisite then
  Antecedent = ;
else if relation between cl is “or” in prerequisite then
  Antecedent = ;
else if relation between cl is “and” in conclusion then
  Consequent = ;
else if relation between cl is “or” in conclusion then
  Consequent = ;
end if;
end for;
   = Antecedent Consequent;
return  ;

Step (1) initializes Tbox and Abox. Steps (2)–(10) convert the metaconcepts and relations of the metaconcept model into axiom sets in Tbox. Steps (11)–(17) add multiplicity constraints of metarelations as axiom sets in Tbox. Subsequently, steps (18)–(24) convert application concepts and relations of the application model into assertion sets in Abox. For a more detailed explanation for each step, refer to our previous work in [22, 24]. Our previous work [23] showed that the conversion is correct and there is no semantics loss.

The domain rules need to be transformed into axiom sets in Tbox. The consistency and completeness rules are specified in DL and can be added to Tbox through steps (11)–(17), while the deductive reasoning rules need to be specified in SWRL. SWRL combines RuleML and OWL DL. It is popularly applied for formal representation of semantic rules and knowledge-based reasoning, supported by the algorithm Tableau [15].

Algorithm 2 is provided to convert the deductive reasoning rules in the choreography model into the formally specified rules in SWRL.

The concepts and relations appearing in the prerequisite of the deductive reasoning rules are converted to Horn clauses that combine to form the antecedent of SWRL rules, and the concepts and relations appearing in the conclusion are converted to Horn clauses that combine to form the consequent of SWRL rules.

For example, the deductive reasoning rules and are converted into following SWRL expressions:

3.2. Prototype Implementation

Algorithms 1 and 2 have been realized and integrated into our requirement analysis tool, the so-called ontology-based requirements elicitation and analysis tool (OBREAT) [25]. The architecture of OBREAT and the major components can be found in Figure 4.

The presentation layer handles the interaction between users and the application. It checks and takes user input, as well as feedbacks on execution information including reasoning results. Currently, there are a variety of modeling notations, such as UML [2631], Message Sequence Charts (MSCs) [32], and BPMN [33, 34], to visually model the choreography interaction. In our research, the UML class diagrams and collaboration diagrams (or Communication Diagrams in [35]) are preferred for the presentation of service choreography models from structural and behavioral viewpoint, respectively, taking the advantages of rigorous syntax and semantics, easiness to interpret, and various tools such as RSA, EA, and Rose to share some part of models.

As the core component of the architecture, the logic layer processes modeling transactions and submits results to the presentation layer, and it also communicates with the data layer for data persistency. The DL-based Formal Model Generator handles the model conversion with the algorithms mentioned before. The output of the generator is sent to the Model Feature Reasoner and then is saved in Formal Model database. The Model Feature Reasoner communicates with DL reasoner Pellet to complete DL-based reasoning and sends the results to the Execution Control Interface. At present, the Model Feature Reasoner realizes checking the property of consistency, completeness, and state reachability. Specifically, the results are also exported to the Domain Knowledge base to enrich domain knowledge for reuse and thus knowledge reusability is continuously enhanced with knowledge accumulation.

The data layer handles data persistency and maintenance. Most of the input/output data are saved as files, either model files or formalized files, while model management data are saved in the database System Administration Database.

For the sake of model exchange, we adopt XML as the model data description language. Since the DL-based reasoner accepts data only in OWL which is somewhat different from XML, the conversion between two types of XML documents is needed. To automate the conversion, we choose the eXtensible Stylesheet Language Transformation (XSLT) technology [36, 37] which is widely used for XML document conversion. We designed a set of XSLT transformation templates compliant with the Ontology Definition Metamodel (ODM) and the UML profile. Moreover, we integrate MagicDraw [38] in OBREAT as the GUI modeling tool in the presentation layer and Pellet 1.5.0 as the DL-based reasoner in the logic layer for verification reasoning.

4. Case Study

In this section, a simplified case of purchase order application of the e-commerce system is studied, illustrating how to construct the application model and how to realize the model verification.

4.1. Construction of Application Model

Limited by the page size, the choreography of the purchase order application is simplified here, and it covers only a few main activities: the buyer sends an order request to the seller, and then the seller checks the buyer’s credit record in the bank and the supplier inventory. If the credit record is good and the inventory is sufficient, then the order will be accepted; otherwise, it will be rejected.

The above activities are encapsulated into five sessions: purchase order request (), credit check (), inventory check (), purchase order response (), and purchase order reject (), where there are four participant roles: Buyer, Seller, Bank, and Supplier.

The buyer initiates an interaction with the seller by placing a purchase order po through the channel Ch@Seller, and then the seller acknowledges the buyer by sending him a poAck. Meanwhile, the states of the purchase orders of Buyer and Seller are set to “sent” and “received,” respectively. Accordingly, the session is composed of four atomic sessions which are specified as follows and is modeled as a UML class diagram in Figure 5:

The other four sessions function as follows: checks the buyer’s credit by sending a request to the bank that sends back the state of the credit (good or bad); checks the seller’s inventory by sending a request to the seller who sends back the state of the inventory (sufficient or short); declares that the order has been accepted if the buyer’s credit is good and the inventory is sufficient; and declares that the order has been rejected if the buyer’s credit is bad or the inventory is short. The four sessions are specified as follows, while the related class models are omitted due to space limitation:

The interaction sequence companied with the participants and key messages can be modeled as a UML communication diagram, as shown in Figure 6.

Having analyzed the interaction and the functions of the sessions, we can organize the choreography of the purchase order application as follows: and specify the preconditions of the sessions as follows:

Furthermore, the SHOIN(D) ontology can be generated by the algorithm Construct Tbox&Abox, and as a result, the axiom set in Tbox and the instance set in Abox are listed as in Table 3.

4.2. Verification of Application Model

Having constructed the application model which is subsequently converted into the SHOIN(D) ontology, we can check the correctness properties of consistency, completeness, and state reachability of the specified choreography separately in the following cases.

Case 1 (consistency checking). Consistency checking is to check whether there is a conceptual conflict against modeling semantics; that is, giving a group of related concepts in the application model, the relationships defined between the concepts should semantically abide by the metarelation declared in the metamodel.

Suppose that the modeler tries to add in the application model a relation of Executing between the Role Buyer and the session S_poReq. But according to the constraints of the metamodel, such relation should only appear between two sessions and thus the consistency rule is broken. The error can be easily found through ontology consistency checking using Pellet.

Case 2 (completeness checking). The completeness checking is to check whether there is a lack of concept or relation in the application model; that is, giving a group of related concepts, the relationships defined between the concepts should quantitatively abide by the multiplicity constraint of the corresponding metarelation in the metamodel.

Suppose that the modeler tries to build a Belong_to relation between the Role Supplier and the Participant ProductTrader, meaning that Supplier belongs to a product trader, but wrongly add at the same time another Belong_to relation between Supplier and ServiceProvider. Such relation violates the multiplicity constraint that means each Role must belong to only one Participant at any time. The checking result is prompted by Pellet as follows: “Consistent: No Reason: The individual Supplier has more than one value for property Belong_to violating the cardinality restriction.”

Case 3 (state reachability checking). The state reachability checking is to examine whether the termination state of a session defined in the application model is reachable; that is, the session can be deduced using the deductive reasoning rules given in Section 2.1.3. If all sessions are deducible, implying that all session states are reachable, the application behavior is verified.

Session S_poReq is deduced with the deductive reasoning rules and and as a result, poState of Buyer will be set to “sent” and the poState of Seller will be set to “received.” The reasoning process is shown as follows:

The other four sessions credit check (), inventory check (), purchase order response (), and purchase order reject () can be deduced in the same way.

The whole reasoning process for the application choreography model is shown as follows, where the deduction rules applied to each step are bracketed with :

The atomic sessions in each step are as follows:

The application behavior is proven correct provided that the reasoning process may eventually end by NULL. Otherwise, there must be a session failed to be deduced (i.e., its termination state is not reachable). If the application may reach the final state, the key attribute poState of both Buyer and Seller will be set to either “completed” or “uncompleted” (this may happen when or ).

To give a negative example, we deliberately assign Seller.poState = “” to make false the precondition P_credChe. As a result, the reasoning process rests on the second step, as the session S_credChe cannot be deduced, and therefore all states of the sessions followed cannot be reached. The exception can be found by entering the following DL-based query command in the form of SPARQL [24]:SELECT ? a WHERE a rdf:type xmlns: Session.? a xmlns: SessionDeduction ? .

Figure 7 shows the query result in the human interface panel of the reasoner Pellet, indicating that the session S_poReq and its four atomic sessions are deduced (or can be successfully executed), while all of the subsequent sessions are not shown (or the corresponding states of these sessions may not be reached). The rules are applied to the reasoning process to find the exception where the SessionDeduction relation is broken between the sessions S_poReq and S_credChe and all of the followed sessions cannot be deduced.

Beside the above experiment, we have also modeled and verified several other cases, including the online shopping example given in [39], the buyer-seller example given in [40], and the example from the supply chain management [41].

The verification is processed efficiently. The reasoning for each case is finished within a second, tested on a laptop computer with a 2 GHz Intel processor and 1 GB of RAM. That accords with the comments by Haarslev and Möller [42] who pointed out that even for the hardest problems the query time for DL reasoning would be within three seconds. But, is it always true?

In order to evaluate the efficiency of reasoning for CML-DL models, we choose the four popular ontologies, VICODI, LUBM, Semintec, and Wine which are used in previous benchmarks, carry out experiment, and lead a statistical analysis by comparing the average response times against the increasing size of Abox. The detailed descriptions about these ontologies can be found in [43].

The ontologies represent four standard benchmark datasets with different complexity. The VICODI ontology is relatively small and simple since it does not contain any disjunctions, existential quantification, or number restrictions. The LUBM ontology and the Semintec ontology are also relatively simple like the VICODI ontology, while the Semintec ontology is more complex since it contains functional properties and disjointness constraints and is constructed using OWL DL. The Wine ontology is the most complex one. It contains a classification of wines and is established using advanced DL constructors.

The above ontologies are too small in Abox for our intended performance evaluations. Therefore, we generate different sizes of datasets, ranging from 100 to 1 million, to increase the Aboxes of the ontologies for test.

To evaluate the query times for various sizes of the datasets, we use several query patterns covering all cases from the simplest one that retrieves all individuals of one concept to the most complex one that retrieves all individuals of the concept. The test method can be referenced by [44].

Figure 8 shows the average time for query response for the four ontologies with the generated datasets sizing from 100 to 1 Mio. All the results reported are averaged over 100 queries. The test is led on a laptop computer with one 2 GHz Intel processor and 1 GB of RAM, running Windows XP Service Pack 4.

As expected, there is a significant increase of query time as the size of the ontology increases from the 100 to 1 million. For the example of the Semintec ontology, the average query time for smaller size from 100 to 10,000 is less than a second (<1). But, the time increases sharply as the size exceeds 10,000, and for the maximum size of 1 Mio., the time is about 800 seconds or about 17 minutes.

Our ontology of CML-DL may match the Semintec ontology since the domain of the CML-DL application is not more complex than that of the financial services, and the model can be constructed using a fragment of OWL DL without advanced DL constructors. Therefore, we expect that the number of individuals of the CML-DL application model, for most projects, would be less than 100000, and thus the reasoning time would be within a few minutes at worst.

System verification accounts for a larger proportion in the domain of system engineering. There are two categories: testing and formal verification. Testing is the process in which testers input the use case through access point and then observe the corresponding output to discover potential error. Formal verification, which can be further divided into model checking based on exhausting searching and model verification based on logic reasoning, expresses system specification by using mathematical methods and then proves whether the designed system meets the desired properties according to mathematical theory.

The initial progress of testing static system properties converts static properties into hard-code and calls these codes during program compilation. Because system specification and system implementation are bounded together, the method does not have flexibility and reusability; that is, when the system static properties to be checked change, the hard-code must be updated as well [45].

The emergence of the open-source projects WS-CDL Eclipse Plugin [46] and Pi4SOA [47] makes it possible to separate system specification and implementation. The engines of these projects provide an execution and simulation environment for WS-CDL documents. When the properties to be checked change, the corresponding descriptions can be revised without any modification to the inspection program. Reference [48] notes that although the WS-CDL is just a description language, it is essentially an XML document; therefore, Eclipse Plugin simulators regard the WS-CDL as executable script language for interpretation and implementation. Nevertheless, the verification methods based on the simulation engine need to traverse the entire value space when handling existential and universal quantifiers, leading to a higher complexity of time and space.

Workflow based web service composition and formal verification techniques are broadly applied. They can be divided into three categories: PN- (Petri Net-) based, FSA- (Finite State Automate-) based, and PA- (process algebra-) based techniques.

Xia et al. [49, 50] achieved the complete conversion from a WS-CDL document to stochastic Petri Nets; moreover, they make assessments of time expectations, probability, and cost expectations of services that normally end due to the index of performance, reliability, and execution cost, respectively. Foster et al. [51] proposed a type of service modeling and verification method called model-based service compositions engineering for the first time. The method maps the service composition model to a finite state process (FSP), aiming at verifying the compatibility between composite services and their environment by using a labeled transition system analyzer (LTSA). Molina-Jimenez and Shrivastava [52] developed the concept of conformance between a contract and a choreography by assuming that they can be modeled by Finite Automaton (FA). The choreography specifications and contracts, specified by the BPMN notation and the event-condition-action rules, describe permissible interactions between partners from different viewpoints. They established a process to automatically check whether all the behaviors permissible in a choreography are also permissible in the corresponding contract and vice versa. Díaz and Llana [53] mapped the service composition model in WS-CDL to a timed automation and then simulated the dynamic behavior of the system using the automatic verification tool UPPAAL. The literature [54] provided the basis for formalizing and reasoning on the mobility characteristics of web services choreography using the process algebra -calculus which is, according to Robin Milner, a model of concurrent computation based on the notion of naming. They argued that the process algebras, such as -calculus, can be used to formalize web services characteristics to ensure that they satisfy some conditions required in SOA. Salaün et al. [55] presented a method of encoding the collaboration diagrams into the LOTOS process algebra. This encoding allows checking realizability of the collaboration diagrams for both synchronous communication and bounded asynchronous communication.

Gu et al. [8, 9] presented a formal modeling framework Abstract WS-CDL with grammar, congruence relations, and operational semantics. They defined a set of mappings of the Abstract WS-CDL global model to the Pi calculus-based local model and accordingly suggested a set of deductive reasoning rules of state reachability and terminability. Unfortunately, they failed to provide the consistency and completeness verification mechanism as well as the corresponding reasoning engine.

Temporal logic (TL) can be used to assert the behavior change with time evolution. The model verification techniques based on TL model the finite states of systems with Promela and specify the system properties with TL expressions to enable verification by checking whether the intersection between the Promela models and the TL expressions is empty. Zhang and Liu [56] presented a formal verification method for CCML (Cooperative Composition Modeling Language) based web service composition. They build a mapping of CCML description to CCS expression and approach property verification and service compatibility verification with the help of a TL based checking mechanism and an automated tool.

The above verification methods, either based on workflow or based on TL, may lead an exhaustive search for all states of system execution, which may cause the state explosion problem when the number of the states exponentially increases as the system scales up.

The DL-based logic reasoning technique is applied to service modeling and verification. Liu et al. [57] proposed a service modeling and composition method based on DL rules using a uniform way of characterizing the static semantic and dynamic interaction characteristics of web services. The method unifies the service composition model within the framework of DL rules to remedy the defect that DL cannot describe the dynamic characteristics of web services. Chang et al. [58, 59] proposed an Extended Dynamic Description Logic EDDL(X) by extending the traditional Dynamic Description Logic (DDL) to transform the web service composition model into the EDDL(X)-based one. However, the above methods focus more on service discovery, service matching, and service composition while addressing less the property verification of service choreography model.

Over the past decade, many organizations and individuals have conducted in-depth and fruitful research in the field of service choreography model verification. Their achievements can be categorized, according to their underlying methods, into hard-coding, simulation engine, workflow, temporal logic, and Abstract WS-CDL. As shown in Table 4, we have made a primary comparison of our method with those techniques by the important features such as description ability, degree of automation, knowledge reusability, efficiency of verification, and state explosion.

Regarding description ability, the methods of hard-coding and the simulation engine rank highest since they allow compiling the XML scripts of WS-CDL models. CMV-DL ranks second high due to the strong ability of induction and high level of abstraction with DL [60]. The methods based on workflow and TL are weak in description ability, because they cannot ensure the completeness of mapping one form of model to another, especially when the model is complex in its structure. The current methods based on PA, for example, provide only a few of simple mapping rules and thus do not guarantee a strict conversion, which may cause loss of semantics during conversion [61].

Automation is an important feature for model verification. Each of the methods listed has related supporting tools, except for Abstract WS-CDL. CMV-DL is graded comparatively high in automation, because it takes advantage of available reasoning engines such as Pellet which are based on the mature algorithm Tableau and which have been integrated into the requirement analysis tool OBREAT.

Regarding knowledge reusability, the workflow based, the TL based, and the other three techniques do not provide the mechanism for knowledge accumulation and reuse, while OBREAT allows enriching domain knowledge by adding new reasoning rules to the Domain Knowledge base and importing the reasoning results back into the knowledge base for further reasoning. But the process relies on human interference and thus the capability is graded low.

The verification efficiency is determined by both human interference and time consumption in validation process. The hard-code method and CMV-DL depend heavily on human interference and therefore are graded lowest. The simulation engine technique needs to traverse the entire value space when handling existential and universal quantifiers, which leads to a higher complexity and therefore a low efficiency. The other three methods, supported by automated tools, work highly efficiently for verification.

When a system has many concurrent components, the state space of system execution might expand infinitely beyond ordinary computation capability. Based on exhaustive search, workflow and TL may suffer from the problem of state explosion which has long been the bottleneck for their development [62]. In contrast, CMV-DL works on deduction reasoning where the space of deduction reasoning is limited by the size of the generated ontology and the computation complexity is decreased by the high efficient reasoner algorithm Tableau [63, 64].

In general, CMV-DL has an obvious advantage in knowledge reusability and is free of state explosion without the loss of description ability, and it has a modest degree of automation and efficiency compared to the other methods.

6. Conclusion and Future Work

This paper focuses on service choreography modeling and verification. It proposes a new approach of choreography model verification based on Description Logic to verify the service choreography model based on SHOIN(D) of DL. The main contributions are as follows:(1)A metaconcept model of service choreography based on WS-CDL is proposed, providing a framework to formally define a service choreography model.(2)Domain rules of consistency, completeness, and deductive reasoning are defined, enabling formal verification of the service choreography model.(3)The model transformation algorithms are provided and thus the service choreography model and domain rules can be converted into the DL ontology and model verification can be thereby made automatically with an available reasoner, such as Pellet.(4)The related work on service choreography verification is investigated. The representative methods are analytically compared with CMV-DL on such features as description ability, degree of automation, knowledge reusability, efficiency of verification, and state explosion.

Currently, CMV-DL focuses only on a few key issues of web service choreography. The goal of the metaconcept model and CMV-DL is to verify several key properties of service choreography models. Therefore, CMV-DL captures only a core set of WS-CDL features while omitting some advanced features, such as exception and finalizing blocks. Future work involves extending the service choreography metaconcept model to cover more properties of WS-CDL, such as the consistency between choreography models and orchestration models. Moreover, the domain rules need to be enriched and improved as the approach is applied to more projects.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.


The paper work was supported by National Natural Science Foundation of China under Program no. 61273210.