Journal of Applied Mathematics

Volume 2013, Article ID 964682, 13 pages

http://dx.doi.org/10.1155/2013/964682

## A Unified Framework for DPLL(T) + Certificates

^{1}Tsinghua National Laboratory for Information Science and Technology (TNList), Beijing 100084, China

^{2}School of Software, Tsinghua University, Beijing 100084, China

^{3}Key Laboratory for Information System Security, MOE, Beijing 100084, China

^{4}Department of Computer Science and Technologies, Tsinghua University, Beijing 100084, China

^{5}Institute of Information Science, Academia Sinica, Taipei 115, Taiwan

Received 6 February 2013; Accepted 8 April 2013

Academic Editor: Xiaoyu Song

Copyright © 2013 Min Zhou et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

Satisfiability Modulo Theories (SMT) techniques are now widely used, and SMT solvers typically serve as verification backends. Whenever an SMT solver is invoked, it is important to ensure that its results are correct. To address this problem, we propose a unified certificate framework based on DPLL(T), comprising a uniform certificate format, a unified certificate generation procedure, and a unified certificate checking procedure. The certificate format is simple, clean, and extensible to different background theories. The certificate generation procedure adapts well to most DPLL(T)-based SMT solvers. Soundness and completeness of DPLL(T) + certificates are established. The certificate checking procedure is straightforward and efficient. Experimental results show that the average overhead of certificate generation is about 10%, which is lower than that of other methods, and that certificate checking takes little time.

#### 1. Introduction

##### 1.1. Background and Motivation

Satisfiability Modulo Theories (SMT) techniques are increasingly popular in verification applications. An SMT solver takes as input an arbitrary formula in a specific fragment of first-order logic with interpreted symbols and returns a judgment stating whether the formula is satisfiable, that is, whether there exists at least one interpretation under which the formula evaluates to true.

SMT solvers are typically used as verification backends. During a verification process, an SMT solver may be invoked many times. Each time, it is given a complex logical formula and is expected to give a correct judgment. However, SMT solvers employ many sophisticated data structures and intricate algorithms, which make their implementations error prone. For instance, some solvers have been disqualified from SMT-COMP because of incorrect judgments [1]. We therefore strongly believe that SMT solvers should be supported by a formal assessment of their correctness.

Ensuring the correctness of an SMT solver is never easy. Generally, there are two approaches: verifying the SMT solver itself or certifying its judgments. The former approach proves directly that the solver is correct, so its judgments are correct by construction. Though logically rigorous, this method is not generally practical because verifying an SMT solver requires a great deal of work: not only the algorithm but also the implementation must be formally verified. To the best of our knowledge, no state-of-the-art SMT solver has been completely verified. More importantly, this approach is solver dependent. Even if a solver is completely verified, the whole verification has to be redone whenever the solver is modified.

The rationale of the latter approach is as follows: instead of proving the correctness of the solver, we prove the correctness of each judgment. For this purpose, the solver is required to return a piece of evidence supporting its judgment. This evidence is called a *certificate* in this paper. The correctness of the judgment is then ensured by the certificate, from which a rigorous and detailed proof can be constructed. Checking a certificate is much easier than verifying a solver. If certificates are presented in a proper format, the checking algorithm can have low complexity. For instance, certain certificates for unsatisfiable SAT problems can be checked in linear time. The certificate approach is also solver independent. We need not look into the solvers; the only requirement is that they generate certificates. If the certificate format is unified, certificate checking can be a uniform procedure. This benefit is vital since we need not develop a separate certificate checker for each SMT solver.

Some SMT solvers, such as CVC3 [2] and Z3 [3], support certificate generation. However, their certificate formats are quite different, so the generation and checking procedures are tool specific. Although SMT solvers differ in their algorithms, we believe that their certificate formats can be unified. The basic idea comes from SAT. In the SAT community, certificates are generated along with the DPLL framework, and the certificates (for unsatisfiable instances) are unified as chains of linear regular resolutions that lead from the initial clauses to the empty clause [4, 5]. With this uniform format, the generation and checking procedures for certificates can also be unified.

This paper considers the DPLL(T) framework, which extends DPLL and is widely used in state-of-the-art SMT solvers. A uniform format for SMT certificates is defined in this paper. The road map of our approach is shown in Figure 1. Note that the size of a certificate can be exponential or even larger in the size of the input problem [6]. To reduce storage space, the certificate format is carefully defined to be concise. Based on this format, a unified procedure for certificate generation is proposed. This procedure can be easily integrated into DPLL(T)-based SMT solvers. Moreover, a certificate checking procedure is also proposed in this paper. The checking procedure is easy, fast, and memory saving. To make it extensible, theory lemmas are checked by individual lemma checkers. Experimental results show that the average overhead of certificate generation is about 10%.

##### 1.2. Related Work

The propositional satisfiability problem (SAT) is a well-known decision problem. It was proven to be NP-complete by Cook [7] in 1971. All known algorithms for SAT have exponential worst-case complexity. State-of-the-art SAT solvers are based on the DPLL algorithm [8]. The basic idea is to explore all valuations by depth-first search in a branch and bound manner. To reduce unnecessary search effort, an optimization called *clause learning* is used, which learns a clause whenever the solver recovers from a conflict. This optimization significantly increases performance and makes SAT algorithms practical. During the clause learning process, each learned clause is a resolvent of a linear regular resolution from existing clauses [4].

SMT is an extension of SAT. As the name implies, the input formula may contain theory-specific predicates and functions. For instance, while only boolean variables are allowed in SAT, is a well-formed SMT formula. The background theory T of SMT is usually a composed theory from several individual theories. In this paper, we focus on quantifier-free SMT problems, for which the popular algorithm is DPLL(T) [9]. The DPLL(T) algorithm is extended from DPLL. Since the DPLL(T) algorithm is described as a transition system, solving an SMT problem is actually seeking a path to the *success* or *fail* state while respecting the guard conditions on transitions. Although there are other heuristics and optimizations, currently it is almost standard to develop SMT solvers based on DPLL(T).

Certificates are, informally, a set of evidence that can be used to construct a rigorous proof for the judgment. For satisfiable cases, the certificate could be a model. What is more interesting is the certificates for unsatisfiable cases. Deduction steps from the input problem to a conflict should be reproducible by referring to the certificate. In this paper, when saying certificates, we refer to those for unsatisfiable cases.

Many state-of-the-art SAT solvers can generate certificates, but their formats differ considerably. For example, the certificates generated by zChaff [10] and MiniSat [11] are quite different. PicoSAT [12] generates concise and well-defined certificates, which can be checked by a tool called TraceCheck [12]. PicoSAT's certificates contain initial clauses and resolvents, where the resolvents are obtained by linear regular resolution. If all the clauses form a directed acyclic graph and the empty clause is derived, then the initial clause set is unsatisfiable. As shown in [4, 5], clause learning in DPLL corresponds to linear regular resolution, so this certificate format can also be adopted by other DPLL-based solvers.

General SMT problems may contain quantifiers. Furthermore, theory-specific inference rules must be considered when deciding satisfiability. Various logical frameworks have been developed to provide flexible languages for writing such proofs; see [13–16]. If we focus on quantifier-free problems, the proof format and proof checking can be simplified. In practice, solvers such as CVC3 and Z3 can generate certificates, but their formats are again quite different because they use different proof rules. Proof rules are a set of inference rules with respect to the background theory T. To be well suited for certificate generation and checking, the certificate format should not depend heavily on theory-specific proof rules.

The overhead of generating certificates should be as small as possible. An approach with 30% overhead is presented in [17], which uses the Edinburgh Logical Framework with Side Conditions (LFSC). In our approach, we focus on quantifier-free problems so that the overhead is further reduced. Certificate generation is related to unsatisfiable core computation. In [18], a simple and flexible way of computing small unsatisfiable cores in SMT was proposed. That approach computes the propositional abstraction of the SMT problem and invokes an existing unsatisfiable core extractor for SAT. Although the idea might be similar, unsatisfiable core computation concerns which clauses lead to unsatisfiability, whereas certificate generation concerns how they do so. It is convenient to generate an unsatisfiable core from a certificate, but not vice versa.
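The direction from certificates to unsatisfiable cores can be illustrated with a minimal sketch. It assumes, purely for illustration, that a certificate is stored as a dictionary from item identifiers to tagged tuples (`"init"`, `"res"`, `"lemma"`); this layout and the function name `unsat_core` are hypothetical and not the format used by any particular solver. Starting from the item that defines the empty clause, the sketch follows resolution dependencies backwards and collects the initial clauses it reaches.

```python
def unsat_core(items, empty_id):
    """Collect ids of the initial clauses that the empty clause depends on.

    items: dict id -> (kind, payload), where kind is
           "init"  (payload: the clause),
           "res"   (payload: list of antecedent item ids),
           "lemma" (payload: the clause, valid in the theory).
    """
    core, seen, stack = [], set(), [empty_id]
    while stack:
        item_id = stack.pop()
        if item_id in seen:
            continue
        seen.add(item_id)
        kind, payload = items[item_id]
        if kind == "init":
            core.append(item_id)      # an input clause used by the refutation
        elif kind == "res":
            stack.extend(payload)     # follow the antecedents
        # lemma items are theory-valid and depend on no input clause
    return sorted(core)


# Hypothetical certificate: clauses 1-3 are refuted, clause 4 is irrelevant.
items = {
    1: ("init", ["p"]),
    2: ("init", ["-p", "q"]),
    3: ("init", ["-q"]),
    4: ("init", ["r", "s"]),
    5: ("res", [1, 2]),
    6: ("res", [5, 3]),
}
core = unsat_core(items, 6)
```

The converse direction is not available from this data alone: a core names the clauses involved but records no resolution steps.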

To check certificates, one can either check them directly, as in the SAT case, or translate them into inputs for theorem provers such as HOL Light [19] or Isabelle/HOL [20]. There is also research on checking certificates using simple term rewriting [21]. It is known that checking certificates by translating them into proof items for theorem provers can take much longer than finding the certificates [20].

##### 1.3. Contribution

In this paper, we present a certificate framework for quantifier-free formulas. Compared to other works, our approach has the following advantages.

(i) The certificate format is simple, clean, and extensible to different underlying theories.

(ii) The certificate generation procedure is well adapted to most DPLL(T)-based SMT solvers. We also implement it in our solver.

(iii) On average, certificate generation induces about 10% overhead, which is much less than other approaches.

(iv) The certificate checking procedure is simplified and has better performance.

The rest of this paper is organized as follows. In Section 2, we recall the original DPLL(T) algorithm. The certificate format is defined in Section 3. The framework of DPLL(T) + certificates is described in Section 4, where the formal rules are defined in Section 4.1, implementation issues are discussed in Section 4.2, the properties are discussed in Section 4.3, and a couple of examples are shown in Section 4.4. Section 5 discusses certificate checking. Experimental results are shown in Section 6. Finally, Section 7 concludes the paper.

#### 2. The DPLL(T) Algorithm

The DPLL(T) algorithm was proposed in [9]. The input is a quantifier-free first order formula in conjunctive normal form (CNF) from the background theory T. DPLL(T) was given as a transition system. Most features and optimizations of SMT algorithms can be formalized in that way.

In first-order logic, any CNF formula can be viewed as a set of clauses. Each clause is a disjunction of literals, and each literal is the positive or negative form of an atom. A formula is satisfiable if there exists an interpretation under which the formula evaluates to true.

In DPLL(T), formula satisfiability is considered in some background theory T (all interpreted symbols should be from T). We use "$\models_T$" to denote theory entailment in T. Given sets of formulas $\Gamma$ and $\Delta$, if any interpretation that satisfies all formulas in $\Gamma$ also satisfies all formulas in $\Delta$, we write $\Gamma \models_T \Delta$. Similarly, "$\models$" denotes propositional entailment, which considers each atomic formula syntactically as a propositional literal.

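Each state is a 2-tuple "$M \parallel F$", where $M$ is a stack of the currently asserted literals and $F$ is the set of clauses to be satisfied. A literal $l$ asserted by a decision is written $l^{d}$. The transition relations are given by the following rules, stated as in [9].

(i) Decide:

$$M \parallel F \;\Longrightarrow\; M\, l^{d} \parallel F$$

*Precondition* (1) $l$ or $\neg l$ occurs in a clause of $F$; (2) $l$ is undefined in $M$.

(ii) UnitPropagate:

$$M \parallel F, C \vee l \;\Longrightarrow\; M\, l \parallel F, C \vee l$$

*Precondition* (1) $M \models \neg C$; (2) $l$ is undefined in $M$.

(iii) TheoryPropagate:

$$M \parallel F \;\Longrightarrow\; M\, l \parallel F$$

*Precondition* (1) $M \models_T l$; (2) $l$ or $\neg l$ occurs in $F$; (3) $l$ is undefined in $M$.

(iv) T-Backjump:

$$M\, l^{d}\, N \parallel F, C \;\Longrightarrow\; M\, l' \parallel F, C$$

*Precondition* (1) $M\, l^{d}\, N \models \neg C$; (2) there is some clause $C' \vee l'$ such that: (a) $F, C \models_T C' \vee l'$ and $M \models \neg C'$; (b) $l'$ is undefined in $M$; (c) $l'$ or $\neg l'$ occurs in $F$ or in $M\, l^{d}\, N$.

(v) T-Learn:

$$M \parallel F \;\Longrightarrow\; M \parallel F, C$$

*Precondition* (1) each atom of $C$ occurs in $F$ or in $M$; (2) $F \models_T C$.

(vi) T-Forget:

$$M \parallel F, C \;\Longrightarrow\; M \parallel F$$

*Precondition* $F \models_T C$.

(vii) Restart:

$$M \parallel F \;\Longrightarrow\; \emptyset \parallel F$$

*Precondition* $\top$ (always applicable).

(viii) Fail:

$$M \parallel F, C \;\Longrightarrow\; \langle \mathit{Fail} \rangle$$

*Precondition* (1) $M \models \neg C$; (2) $M$ contains no decision literals.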

The DPLL(T) algorithm assumes that there exists a T-solver that can check the consistency of conjunctions of literals given in T. The algorithm terminates under weak assumptions and is proven sound and complete [9].
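As an illustration of how one of these rules operates on a state, the following minimal sketch applies UnitPropagate exhaustively. It assumes an encoding that is not part of the paper: literals are nonzero integers (a negative number denotes the negated atom), the literal stack is a list, and clauses are lists of literals. The function name `unit_propagate` is hypothetical, and conflict detection and theory reasoning are deliberately left out.

```python
def unit_propagate(trail, clauses):
    """Apply UnitPropagate until no clause is unit under the trail.

    trail:   list of asserted literals (extended in place and returned).
    clauses: iterable of clauses, each a list of integer literals.
    """
    assigned = set(trail)
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            if any(lit in assigned for lit in clause):
                continue  # clause already satisfied by the trail
            # literals that are not falsified by the trail
            open_lits = [lit for lit in clause if -lit not in assigned]
            if len(open_lits) == 1:
                # all other literals are false, so this one is forced
                trail.append(open_lits[0])
                assigned.add(open_lits[0])
                changed = True
    return trail


result = unit_propagate([1], [[-1, 2], [-2, 3], [4, 5]])
```

Each literal appended here corresponds to one UnitPropagate transition; a certificate-producing solver would additionally remember which clause forced it.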

#### 3. The Certificate Format

If a formula is satisfiable, the certificate is simply an interpretation under which the formula evaluates to true. On the other hand, if it is unsatisfiable, the certificate can be much more complicated. It must be shown that "all attempts to find a model failed." More precisely, each branch of the search tree must be tried. Technically speaking, this *closed search tree* can be presented as a refutation procedure containing sequences of resolution steps that produce the empty clause [12]. If the search tree is large, so is the set of refutation clauses.

Remember that in first order logic (FOL), constants are considered as nullary functions. Then, a term is either a variable or a function with arguments which are also terms. For example, , and are valid terms. An atom (or atomic formula) is a predicate with arguments being terms. Note that nullary predicates are actually propositional variables.

*Definition 1 (T-atom, T-literal). *Given a background theory T, a T-atom is an atom whose functions and predicates are in the signature of T. A T-literal is the positive or negative form of a T-atom.

For example, given T the theory of *Equality with Uninterpreted Functions*, if is a propositional variable, then , are T-atoms and , , and are T-literals.

A proof rule is an inference rule which has several T-literals as premises and one T-literal as the conclusion. (A proof rule whose conclusion is a conjunction of T-literals can be split into several proof rules, one for each conjunct.) For example, "$x = y$ implies $f(x) = f(y)$" is a proof rule (the monotonicity of equality). This rule holds for all T-terms $x, y$ and all functions $f$. Each proof rule can be instantiated to a theory lemma. For instance, if $a, b$ are two specific T-terms and $g$ is a specific function, then $a = b \rightarrow g(a) = g(b)$ is a theory lemma.
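Checking that a theory lemma is an instance of the monotonicity rule needs only syntactic matching. The sketch below is one way to do this under assumptions that are ours, not the paper's: terms are either strings (constants or variables) or tuples `(function_name, arg1, ...)`, an equality is a pair of terms, and the hypothetical function `is_monotonicity_instance` accepts the n-ary form of the rule, allows premises in either orientation, and tolerates identical arguments and unused premises.

```python
def is_monotonicity_instance(premises, conclusion):
    """Check that `conclusion` follows from `premises` by monotonicity.

    premises:   list of (s, t) pairs, each meaning s = t.
    conclusion: pair (u, v) meaning u = v, expected to have the shape
                f(a1, ..., an) = f(b1, ..., bn).
    """
    u, v = conclusion
    if not (isinstance(u, tuple) and isinstance(v, tuple)):
        return False                      # both sides must be applications
    if u[0] != v[0] or len(u) != len(v):
        return False                      # same function symbol and arity
    # equalities are symmetric, so accept each premise in both orientations
    known = set(premises) | {(t, s) for s, t in premises}
    return all(a == b or (a, b) in known for a, b in zip(u[1:], v[1:]))


ok = is_monotonicity_instance([("a", "b")], (("g", "a"), ("g", "b")))
bad = is_monotonicity_instance([("a", "c")], (("g", "a"), ("g", "b")))
```

No decision procedure for the theory is invoked: the check is a single pass over the arguments of the two applications.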

*Definition 2 (clause item). *A clause item is one of the following three forms.(i)Init : an initial clause , where all are T-literals. (ii)Res : a clause obtained from a linear resolution of , where all are clauses items. (iii)Lemma : a theory lemma where all and are T-literals. The defined clause is .

A resolution chain is called a *linear regular resolution* chain if there exists a sequence of clauses such that: (1) is the resolvent of and ; (2) for , is the resolvent of and ; (3) all resolutions are applied on different T-atoms.
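The definition can be read operationally: resolve the first two clauses, resolve the result with the third, and so on, never resolving on the same atom twice. The following minimal sketch follows that reading, assuming clauses are sets of nonzero integers with negation as sign change; the encoding and the function names are illustrative choices rather than part of the certificate format.

```python
def resolve(c1, c2):
    """Return (resolvent, pivot_atom), or None unless the two clauses
    clash on exactly one atom."""
    pivots = [lit for lit in c1 if -lit in c2]
    if len(pivots) != 1:
        return None
    p = pivots[0]
    return (c1 - {p}) | (c2 - {-p}), abs(p)


def linear_regular_resolution(clauses):
    """Resolve clauses[0] with clauses[1], the result with clauses[2], ...

    Returns the final resolvent as a frozenset, or None if some step is
    not a valid resolution or a pivot atom is used twice (regularity).
    Assumes `clauses` is non-empty.
    """
    current = frozenset(clauses[0])
    used_atoms = set()
    for clause in clauses[1:]:
        step = resolve(current, frozenset(clause))
        if step is None:
            return None
        current, atom = step
        if atom in used_atoms:
            return None
        used_atoms.add(atom)
    return current


empty = linear_regular_resolution([{1}, {-1, 2}, {-2}])
irregular = linear_regular_resolution([{1, 2}, {-1, 2}, {-2, 1}, {-1}])
```

In the second call the last step would resolve on atom 1 again, so the chain is rejected even though each individual step is a valid resolution.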

A clause item gives the syntax form for a clause. For each clause item , if it is well defined; the concrete form of (given as disjunction of T-literals) can be calculated. It is called *a concrete clause*, denoted by . We do not distinguish and when no ambiguity is caused.

*Definition 3 (certificate item). *A certificate item is either of the following: (i)Define ClauseItem,(ii)Forget ClauseItem.

*Definition 4 (certificate). *A certificate with respect to an SMT instance is a sequence of certificate items. The size of , denoted by , is the number of certificate items contained in . is the th element in , for .

*Definition 5 (context). *Given a certificate and a number , the context with respect to , denoted by , is a set of clause items such that:

Specially, denote by .

*Example 6. *Assume that , and , . A certificate for the unsatisfiability of the clause set is presented in Table 1. In this example, is intuitively unsatisfiable. Note that entails which implies in
T_{EUF}, while .

*Definition 7 (well-formed certificate). *A certificate is well formed if for all : (i)*if ** is a *Define* item then one of the following conditions hold: *it is of Init type; it is of Res type where for all , and there should be a linear regular resolution from to ; it is of Lemma type and the concrete clause is valid in theory T, that is, ;(ii)*otherwise ** is a Forget item and the referred clause must be defined in **. *

#### 4. Certificate Generation with DPLL(T)

The certificate generation algorithm is described in this section. We first describe the abstract rules and implementation issues, then prove some important properties, and finally give a couple of examples.

##### 4.1. The CDPLL(T) Algorithm

We describe here a certificate generation procedure that can be well adapted to the DPLL(T) algorithm. The extension of DPLL(T) with certificate generation is called CDPLL(T).

*Definition 8 (certificate refinement). *The certificate obtained by appending a certificate item to the end of the certificate is denoted by .

We use a transition system to model the CDPLL(T) algorithm. Each state is represented as a 3-tuple “,” where is the literal stack, the clause set to be satisfied, and the current certificate. The transition rules in CDPLL(T) are as follows.

(i) Decide:

*Precondition*(1) or occurs in a clause of ; (2) is undefined in ,

(ii) UnitPropagate:

*Precondition*(1); (2) is undefined in ,

(iii) TheoryPropagate:

*Precondition *(1); (2) or occurs in ; (3) is undefined in ,

(iv) T-Backjump:

*Precondition *(1); (2)there exists a clause such that: (a) and , (b) is undefined in , and (c) or occurs in or in ;(3) is a set of clause items such that: for all , and there exists a linear regular resolution from to ,

(v) T-Learn(I): this rule models the clause learning procedure.

*Precondition*(1)Each atom of occurs in or in ; (2);(3) is a set of clause items such that: for all , and there exists a linear regular resolution from to ,

(vi) T-Learn(II): this rule models the learning of theory lemma.

*Precondition *(1)Each atom of occurs in or in ; (2),

(vii) T-Forget:

*Precondition* ,

(viii) Restart:

*Precondition* $\top$ (always applicable)

(ix) Fail:

*Precondition *(1); (2) contains no decision literals;(3) is a set of clause items such that: for all , and there exists a linear regular resolution from to ,

The initial state is where is the certificate that defines all initial clauses. When a final state is reached, is the obtained certificate for the unsatisfiability judgment.

The major differences between CDPLL(T) and DPLL(T) are as follows. (1) Certificate generation is added; note the modifications to the rules T-Backjump, T-Learn, T-Forget, and Fail. (2) In order to generate well-formed certificates, the original T-Learn rule is split into two cases, one for the learning of resolvents (corresponding to clause learning) and the other for the learning of theory lemmas (corresponding to deductions in T). The first three rules are unchanged except for the added certificate component.

##### 4.2. Implementation of CDPLL(T)

In a real implementation of CDPLL(T), additional information needs to be collected to construct the certificate items. In this section, we discuss these details.

Lemma 9. *For any reachable state in CDPLL(T) and any clause , there exists a clause item such that .*

*Proof. *We prove that by induction. Initially, this property holds because all clauses in are initial clauses and consists of all initial clause items (by definition). At each valid transition step, the clause set and the certificate may be modified. However, each rule ensures that whenever a clause is added to , there is always a clause item such that added to , for example, in the rule T-Learn(I). Furthermore, whenever a clause item is removed from , the corresponding clause is removed in , for example, in the rule T-Forget. Thus, the property holds on all valid trace of CDPLL(T).

###### 4.2.1. Record Reasons

For any literal in , it is either added by the Decide rule or forced by a clause or theory lemma. For the latter case, we need to record a clause item which is a deduction of the lemma and from which the literal can be enforced. We call this item the *reason* for literal being in .

Only rules UnitPropagate, TheoryPropagate, and T-Backjump can enforce new literal to . We discuss in the following the reasons recorded by these three rules. (i)UnitPropagate: the reason is a certificate item such that . By Lemma 9, such item always exists. (ii)TheoryPropagate: assume that and ; then, the reason is a certificate item such that . (iii)T-Backjump: the reason for is a certificate item such that . We will show in the following that such certificate item always exists when T-Backjump is applicable. For convenience, a subscript is used to denote the reason, for example, denotes the literal asserted by the reason .

###### 4.2.2. Learn Clauses

In rules T-Backjump and Fail, once there is a conflict, a backjump clause needs to be learned. If some branches have not been explored, T-Backjump is applied; otherwise, Fail is applied. Each clause learning procedure introduces new certificate items.

We use the implication graph to analyze the clause learning process. Recall that in DPLL all literals are propositional, and the implication graph is simply a directed acyclic graph (DAG) in which each edge corresponds to a deduction step. In CDPLL(T), the implication graph has two kinds of deductions: propositional deduction (as in DPLL) and theory deduction. Each deduction step in the implication graph is labelled with its reason. This graph can be viewed as a *generalized implication graph*. When no ambiguity arises, we still call it an implication graph.

In the implication graph of DPLL, each cut containing the conflict literals corresponds to a learned clause (obtained by resolving all the clauses associated with the edges in this cut) [4]. No matter which criterion (e.g., 1-UIP) is used, the rule for clause learning is applicable. The same holds for CDPLL(T) when the generalized implication graph is considered.

There are two situations where T-Backjump can be applied. Given a state , the first situation is that there exists a clause which is falsified by , that is, . Then, is called the *conflict clause*. As in the rule, some backjump clause should be learned by analyzing the conflict. Usually, is a resolvent of a linear resolution. A certificate item with resolution type will be learned.

The other situation is that itself becomes T-inconsistent. Then, a subset of is unsatisfiable, that is, . This is also a conflict clause. It may not belong to , but it can definitely be learned from along with some theory lemmas. The fact that leads to conflict is represented in the generalized implication graph. Furthermore, the clause that is required by the rule T-Backjump can also be learned by generalized conflict clause analysis.

In both situations, the backjump clauses are learned by clause learning. The certificate for unsatisfiability is actually a well-organized collection of these clause items. If Fail is applicable, no literal in is decision variable; then, the learnt clause is the empty clause which shows .

*Example 10. *A generalized implication graph is shown in Figure 2, where the literals are as follows: , , and is a propositional literal. Clauses are as follows:

Assume the current assignment . Then, there is a conflict between and . In all, there are 3 deductions in the implication graph: forced by , by and by the theory lemma . Among those reasons, solid lines correspond to existing clauses in , while dashed lines correspond to theory lemmas. is actually a theory lemma .

In the clause learning procedure, we start from the conflict and trace back (apply a sequence of resolutions). If the 1st UIP schema [4] is used, it is possible to learn or (corresponds to and , resp.). The resolution steps for learning are shown in Figure 3. Actually, is the resolvent of which are exactly the reasons labeled on the edges from the conflict to the cut. Among those, are normal clauses, and is a theory lemma.

Based on the above discussions, the related rules can be interpreted more precisely as follows.

(i) TheoryPropagate: The precondition requires that . Let be a subset of which entails (possibly ). Let , then . Furthermore, , which means is a unit clause under the assignment . Thus, is the reason of .

(ii) UnitPropagate: The precondition requires that ; thus, is a unit clause under . So, the reason of is .

(iii) T-Backjump: The precondition requires that there exists some clause such that and . The clause is then used as the reason for . As clause learning is always applicable [9], this clause always exists. If the assignment is T-consistent, then there must be some conflict clause . As explained above, clause learning will then be applied, and the learned clause will be . On the other hand, when is T-inconsistent, clause learning is also applicable in the generalized implication graph and generates a candidate . In both cases, can be learned by applying linear regular resolutions on a clause set [4]. Moreover, such is the union of a subset of reasons in and a group of theory lemmas.

(iv) Fail: The Fail rule is similar to T-Backjump except that the learned clause is the empty clause.
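The Fail case can be sketched as follows. The sketch assumes that every literal on the stack has a recorded reason clause (there are no decision literals), that reasons of theory-propagated literals have already been recorded as lemma clauses, and that literals are nonzero integers; the function name `derive_empty_clause` and the data layout are hypothetical. Starting from the falsified clause, it resolves against the reasons in reverse assertion order and returns the antecedent list that a resolution-type certificate item would refer to.

```python
def derive_empty_clause(trail, conflict_id, clauses):
    """Resolve a falsified clause down to the empty clause.

    trail:       list of (literal, reason_id) in assertion order,
                 with no decision literals.
    conflict_id: id of a clause falsified by the trail.
    clauses:     dict id -> set of integer literals.
    Returns the ordered list of clause ids used in the resolution chain.
    """
    current = set(clauses[conflict_id])
    chain = [conflict_id]
    for lit, reason_id in reversed(trail):
        if -lit in current:
            # resolve on the atom of `lit` with the clause that forced it
            current = (current - {-lit}) | (set(clauses[reason_id]) - {lit})
            chain.append(reason_id)
    assert not current, "conflict clause is not falsified by the trail"
    return chain


clauses = {1: {1}, 2: {-1, 2}, 3: {-1, -2}}
trail = [(1, 1), (2, 2)]          # 1 forced by clause 1, 2 forced by clause 2
chain = derive_empty_clause(trail, 3, clauses)
```

Because each stack literal is visited once, every resolution step uses a different atom, so the chain is linear and regular. For T-Backjump the same loop would stop earlier, at a cut such as the first UIP, instead of running to the empty clause.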

##### 4.3. Properties of CDPLL(T)

The soundness and completeness of CDPLL(T) are proven in this subsection.

Lemma 11. *A well-formed certificate modified by T-Backjump, T-Learn(I), T-Learn(II), Fail, or T-Forget is still well formed.*

*Proof. *Each of the five rules appends a new item to the certificate. Assume that the certificate is modified to where is the appended certificate item. The well-formedness of is checked by case analysis on the rule applied. (i) For T-Backjump: a certificate item of resolution type is appended. According to the precondition, the certificate items referred to by are already defined in . So, is still well formed. (ii) For T-Learn(I): similar to the previous case. The dependency of the appended certificate item is satisfied. (iii) For T-Learn(II): a theory lemma item is appended. It is required that . Thus, is still well formed. (iv) For Fail: similar to the T-Backjump case. (v) For T-Forget: the rule requires that the forgotten clause is in the clause set. By Lemma 9, we know that there is a certificate item in which defines . Thus, well-formedness is ensured.

With the help of this lemma, it can be proven that the CDPLL(T) procedure always generates well-formed certificates.

Lemma 12. *Given any reachable state in any CDPLL(T) procedure, is a well-formed certificate. *

*Proof. *First of all, the initial certificate is well formed because it only contains initial items. Secondly, the certificate is only modified in T-Backjump, T-Learn(I), T-Learn(II), Fail, and T-Forget. By Lemma 11, these rules will preserve well formedness. So, following any path of CDPLL(T) will always generate a well-formed certificate. That completes the proof.

Theorem 13 (soundness). *Given a clause set , for any certificate , if is reachable in a CDPLL(T) procedure, then is also reachable in a DPLL(T) procedure. In particular, if 〈Fail〉 is reachable in CDPLL(T), then 〈Fail〉 is also reachable in DPLL(T). *

*Proof. *For each transition step in CDPLL(T):
it holds that
So, given a CDPLL(T) trace, a trace in DPLL(T) can be obtained by removing the certificate part. If is reachable in CDPLL(T) so is in DPLL(T).

Theorem 14 (completeness). *Given a clause set , if the state is reachable in a DPLL(T) procedure, then there must be a certificate such that is reachable in a CDPLL(T) procedure. Furthermore, if 〈Fail〉 is reachable in a DPLL(T) procedure, then 〈Fail〉 is reachable in a CDPLL(T) procedure and . *

*Proof. *For rules Decide, TheoryPropagate, UnitPropagate, T-Learn(II), T-Forget, and Restart, the preconditions are equal to those in CDPLL(T). So, if there is a transition in DPLL(T) labelled with one of these rules, there could also be the same transition in CDPLL(T).

For other rules, T-Backjump, T-Learn(I), and Fail, we need to prove that: if is reachable, then there is some certificate such that is reachable. We prove that by induction. Initially, this condition holds because the initial state corresponds to . Inductively, if
is a possible transition in DPLL(T), and there is some such that is reachable in CDPLL(T). (i) If the transition is labelled with T-Learn(I), the learned clause comes from clause learning, which corresponds to linear regular resolution. It is always possible to first learn the necessary theory lemmas in the implication graph by T-Learn(II) and reach a state on which the preconditions of T-Learn(I) are satisfied. Then, is reached. (ii) If it is labelled with T-Backjump or Fail, the situation is similar.

Our approach can be well adapted to any DPLL(T)-based SMT solver. It is sound and complete regardless of the theory learning scheme used or the order of the decision procedure. It works well as long as clause learning is based on the generalized implication graph.

##### 4.4. Examples

A couple of examples are discussed in this subsection. The first example is given as a CNF formula on propositional variables. In this case, the SMT problem reduces to an SAT problem. No theory deduction is involved in this example; so, we can get a general idea of the structure of certificates.

*Example 1.* Consider the following clause set:

Let ; assume that defines everything in as initial clause; then, a possible CDPLL(T) procedure is shown in Table 2, where “” means this field is the same as that in the previous row.

In step 4, is falsified. By analysing the implication graph, a clause is learned from . Among the referred clauses, is a falsified clause, and are in the reasons. In step 7, is falsified, but there is no decision variable. By analysing the implication graph, we can find a resolution from to the empty clause. Among the referred clauses, is a falsified clause, and are in the reasons.

*Example 2.* Consider the background theory
T_{EUF} and a clause set containing the following:
Its boolean structure is
where

A possible CDPLL(T) procedure is shown in Table 3, where the referred certificates are as follows:

In step 4, becomes T-inconsistent. The generalized implication graph is shown in Figure 4. A theory lemma is learned (instantiated the monotonicity property of the equality) firstly, then the backjump rule is applicable. In steps 11 and 12, becomes T-inconsistent again. A theory lemma is then learned (transitivity of equality), and then clause learning is performed which learns the empty clause. The implication graph is shown in Figure 5.

#### 5. The Certificate Checking

Lemma 15. *For any initial clause set , given a well-formed certificate and , if is a certificate item that defines a clause, then . *

*Proof.* We prove that by induction. The base case is easy to prove because for any initial item that defines a clause, . Therefore, it is trivial that . Thus, .

(i) For resolution certificate items, assume that . By definition, all are defined in . Then, by the inductive hypothesis, we know . By the property of resolution, it is true that .

(ii) For theory lemma items, it is true that . So, .
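The inductive step for resolution items relies on the fact that a resolvent is entailed by its two premises. As an illustration only, the sketch below checks this property by brute force on small propositional clauses. The encoding of literals as nonzero integers (DIMACS style) and all function names are assumptions of this sketch.

```python
from itertools import product


def resolve(c1, c2, pivot):
    """Resolvent of c1 (which contains pivot) and c2 (which contains -pivot)."""
    assert pivot in c1 and -pivot in c2
    return (c1 - {pivot}) | (c2 - {-pivot})


def satisfies(assignment, clause):
    """True if at least one literal of the clause holds under the assignment."""
    return any(assignment[abs(lit)] == (lit > 0) for lit in clause)


def entails(premises, conclusion):
    """Brute-force check that every model of the premises satisfies the conclusion."""
    variables = sorted({abs(lit) for c in premises + [conclusion] for lit in c})
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if all(satisfies(assignment, c) for c in premises) \
                and not satisfies(assignment, conclusion):
            return False
    return True


c1 = frozenset({1, 2})     # x1 or x2
c2 = frozenset({-1, 3})    # not x1 or x3
resolvent = resolve(c1, c2, 1)
```

Here `entails([c1, c2], resolvent)` holds, whereas `entails([c1], resolvent)` does not, since the resolvent needs both premises.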

Theorem 16. *Given a clause set , if a certificate for is well formed and , then is unsatisfiable. *

*Proof. *By Lemma 15, we know that in this case.

By Theorem 16, certificate checking amounts to checking whether the certificate is well formed and contains the empty clause. Checking the well-formedness of a certificate can directly follow Definition 7. Only one traversal from the first item to the last one is needed. Moreover, the following properties make the checking process even easier.

(i) Only the checking of theory lemmas is performed in theory T. Once a theory lemma is proved valid, it can be treated as a propositional clause.

(ii) For each individual proof rule, we have a dedicated checker to check whether the theory lemma is an instance of the rule. In this way, the certificate checker is extensible: given a new proof rule, we only need to add the corresponding lemma checker. Notice that each proof rule defines an atomic deduction step, so the checking effort is much less than solving a constraint in the theory T. For instance, only pattern matching is needed to check whether a theory lemma is an instance of the monotonicity property of equality.
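The single-pass check described above can be sketched as follows. This is a simplified illustration, not the paper's checker: the item layout (`"init"`, `"lemma"`, `"res"`, `"forget"` tuples), the integer encoding of literals over the boolean structure, and the `lemma_checker` callback are all assumptions of this sketch, and the actual well-formedness conditions are those of Definition 7.

```python
def resolve_step(c1, c2):
    """Resolve two clauses on their unique clashing literal, or return None."""
    pivots = [lit for lit in c1 if -lit in c2]
    if len(pivots) != 1:
        return None
    p = pivots[0]
    return (c1 - {p}) | (c2 - {-p})


def check_certificate(items, initial_clauses, lemma_checker):
    """One traversal over the certificate items.

    Hypothetical item layout:
      ("init",   id, clause)        clause must be one of the initial clauses
      ("lemma",  id, rule, clause)  clause must be accepted by lemma_checker
      ("res",    id, [ids])         linear resolution chain over stored clauses
      ("forget", id)                drop a stored clause
    Returns True iff every item is valid and the empty clause is derived.
    """
    live = {}               # clauses currently stored, indexed by item id
    derived_empty = False
    for item in items:
        kind, ident = item[0], item[1]
        if kind == "forget":
            if ident not in live:
                return False
            del live[ident]
            continue
        if ident in live:
            return False    # identifiers of stored items must be unique
        if kind == "init":
            clause = frozenset(item[2])
            if clause not in initial_clauses:
                return False
        elif kind == "lemma":
            rule, clause = item[2], frozenset(item[3])
            if not lemma_checker(rule, clause):
                return False
        elif kind == "res":
            premises = item[2]
            if not premises or any(p not in live for p in premises):
                return False
            clause = live[premises[0]]
            for p in premises[1:]:
                clause = resolve_step(clause, live[p])
                if clause is None:
                    return False
        else:
            return False
        live[ident] = clause
        derived_empty = derived_empty or len(clause) == 0
    return derived_empty


# A small propositional example: {x1 or x2, not x1 or x2, not x2} is unsatisfiable.
initial = {frozenset({1, 2}), frozenset({-1, 2}), frozenset({-2})}
certificate = [
    ("init", 1, [1, 2]),
    ("init", 2, [-1, 2]),
    ("init", 3, [-2]),
    ("res", 4, [1, 2]),     # derives the unit clause x2
    ("forget", 1),
    ("res", 5, [4, 3]),     # derives the empty clause
]
no_lemmas = lambda rule, clause: False   # stub: this example has no theory lemmas
```

Once a lemma item has passed `lemma_checker`, it is stored and resolved against like any other propositional clause, which mirrors property (i).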

#### 6. Experiments

Two criteria are adopted to assess our certificate approach: the overhead for generating certificates and the cost of certificate checking. We implemented an SMT solver, aCiNO, based on the CDPLL(T) algorithm. It uses the Nelson-Oppen framework to solve combined theories. Currently, $T_{EUF}$ and $T_{LRA}$ are considered. The experiments are carried out on a machine with an E7200 dual core CPU (2.53 GHz per core) and 2.0 GB RAM.

All test cases are taken from SMT-LIB [22]. Our approach is tested on 19 unsatisfiable instances taken from different folders, so as to cover different kinds of problem instances. Experimental results are shown in Table 4.

In Table 4, the first column-group describes the scale of the input problem. The input is transformed to its equivalent conjunctive normal form, and the numbers of literals and clauses are counted. In the CDPLL(T) procedure, once a closed branch is encountered, a new clause is learned. The number of learned clauses is listed in the second column-group. This number is equal to that in DPLL(T), since certificate generation does not affect the search procedure of the SMT problem. As in the DPLL(T) algorithm, the number of learned clauses can be quite large (e.g., the 17th case). The time used in CDPLL(T) is also presented. The third column-group summarises the generated certificates. The number of theory lemma items is usually very large. For SMT problems, this is not surprising, since the major work of SMT solvers lies in reasoning in the background theory. The “Forget” column gives the number of forget items that forget an initial or resolution certificate item. All theory lemmas are eventually forgotten, so the corresponding forget items are not counted here. The resources and time required to check the certificates are listed in the *Checking* column-group. All certificates are checked to be well formed. The “Mem” column is not the size of memory consumed but the maximal number of clause items stored in memory during checking. The time for certificate checking is listed alongside.

Regarding certificate checking, the number of initial clauses and the memory consumption are compared in Figure 6. It is rather interesting that, although the certificate itself could be exponentially large with respect to the input problem, the memory consumption does not grow in the same way. The reason is that we have explicitly forgotten many items. In particular, many theory lemmas are referenced only locally. They are available only for a short period of time, after which they are forgotten. With the technique of forgetting items, certificate checking becomes more efficient. This is also supported by the data in the “Time” column of Table 4.
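The effect of forget items on the “Mem” measure can be illustrated with a toy trace. The sketch below is not derived from the paper's data; the event format and the synthetic trace, in which each theory lemma is used locally and then dropped, are assumptions made purely for illustration.

```python
def peak_live_items(events):
    """Maximal number of clause items stored at the same time.

    events is a sequence of ("def", id) and ("forget", id) pairs
    (a hypothetical, simplified view of a certificate).
    """
    live, peak = set(), 0
    for kind, ident in events:
        if kind == "def":
            live.add(ident)
            peak = max(peak, len(live))
        elif kind == "forget":
            live.discard(ident)
    return peak


def local_lemma_trace(rounds, forget):
    """Synthetic trace: one locally used theory lemma per round."""
    events = []
    for i in range(rounds):
        events.append(("def", f"lemma{i}"))
        if forget:
            events.append(("forget", f"lemma{i}"))
    return events


with_forget = peak_live_items(local_lemma_trace(1000, forget=True))
without_forget = peak_live_items(local_lemma_trace(1000, forget=False))
```

In this toy setting the peak stays constant when lemmas are forgotten and grows linearly with the trace length when they are not.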

The overhead of our approach is reported in Table 5. The tested cases are also from SMT-LIB. To reduce timing measurement error, we intentionally selected time-consuming cases (e.g., those running longer than 0.01 s). The maximal overhead is about 27.33%, and the average overhead is about 10.5%. This is much smaller than that of other approaches [17, 21].
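For reference, one natural reading of the overhead figures is the relative increase in solving time when certificate generation is enabled. This definition is an assumption here, since the text does not state the formula explicitly; $t_{\mathrm{DPLL(T)}}$ and $t_{\mathrm{CDPLL(T)}}$ denote the running times without and with certificate generation on the same instance:

$$\text{overhead} = \frac{t_{\mathrm{CDPLL(T)}} - t_{\mathrm{DPLL(T)}}}{t_{\mathrm{DPLL(T)}}} \times 100\%$$

Under this reading, a hypothetical instance that takes 1.00 s without certificates and 1.10 s with them would have an overhead of 10%.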

#### 7. Conclusion

In this paper, a unified certificate framework for quantifier-free SMT instances was presented. The certificate format is simple, clean, and extensible to other background theories. The certificate generation procedure can be easily integrated into any DPLL(T)-based SMT solver. Soundness and completeness of the extension of DPLL(T) with the certificate generation procedure were established. Experimental results show that our certificate framework outperforms others in both certificate generation and certificate checking.

#### Acknowledgments

This work was supported by the Chinese National 973 Plan under Grant no. 2010CB328003; the NSF of China under Grants nos. 61272001, 60903030, and 91218302; the Chinese National Key Technology R&D Program under Grant no. SQ2012BAJY4052; and the Tsinghua University Initiative Scientific Research Program.

#### References

- C. Barrett, M. Deters, L. de Moura, A. Oliveras, and A. Stump, “6 Years of SMT-COMP,” *Journal of Automated Reasoning*, vol. 50, no. 3, pp. 243–277, 2013.
- C. Barrett and C. Tinelli, “CVC3,” in *Proceedings of the 19th International Conference on Computer Aided Verification*, pp. 298–302, Springer, July 2007.
- L. de Moura and N. Bjørner, “Z3: an efficient SMT solver,” in *Tools and Algorithms for the Construction and Analysis of Systems*, C. Ramakrishnan and J. Rehof, Eds., vol. 4963 of *Lecture Notes in Computer Science*, pp. 337–340, Springer, Berlin, Germany, 2008.
- P. Beame, H. Kautz, and A. Sabharwal, “Understanding the power of clause learning,” in *Proceedings of the International Joint Conference on Artificial Intelligence*, pp. 1194–1201, Citeseer, Acapulco, Mexico, August 2003.
- J. Silva, “An overview of backtrack search satisfiability algorithms,” in *Proceedings of the 5th International Symposium on Artificial Intelligence and Mathematics*, Citeseer, January 1998.
- P. Beame and T. Pitassi, “Propositional proof complexity: past, present, and future,” in *Bulletin of the European Association for Theoretical Computer Science*, The Computational Complexity Column, pp. 66–89, 1998.
- S. Cook, “The complexity of theorem-proving procedures,” in *Proceedings of the 3rd Annual ACM Symposium on Theory of Computing*, pp. 151–158, ACM, Shaker Heights, Ohio, USA, 1971.
- M. Davis, G. Logemann, and D. Loveland, “A machine program for theorem-proving,” *Communications of the ACM*, vol. 5, pp. 394–397, 1962.
- R. Nieuwenhuis, A. Oliveras, and C. Tinelli, “Solving SAT and SAT modulo theories: from an abstract Davis–Putnam–Logemann–Loveland procedure to DPLL(T),” *Journal of the ACM*, vol. 53, no. 6, Article ID 1217859, pp. 937–977, 2006.
- M. W. Moskewicz, C. F. Madigan, Y. Zhao, L. Zhang, and S. Malik, “Chaff: engineering an efficient SAT solver,” in *Proceedings of the 38th Design Automation Conference*, pp. 530–535, June 2001.
- N. Eén and N. Sörensson, “An extensible SAT-solver,” in *Theory and Applications of Satisfiability Testing*, pp. 333–336, Springer, Berlin, Germany, 2004.
- A. Biere, “PicoSAT essentials,” *Journal on Satisfiability, Boolean Modeling and Computation*, vol. 4, article 45, 2008.
- M. Boespflug, Q. Carbonneaux, and O. Hermant, “The $\lambda \pi$-calculus modulo as a universal proof language,” in *Proceedings of the 2nd International Workshop on Proof Exchange for Theorem Proving (PxTP '12)*, June 2012.
- A. Stump, D. Oe, A. Reynolds, L. Hadarean, and C. Tinelli, “SMT proof checking using a logical framework,” *Formal Methods in System Design*, vol. 42, no. 1, pp. 91–118.
- D. Déharbe, P. Fontaine, B. Paleo et al., “Quantifier inference rules for SMT proofs,” in *Proceedings of the 1st International Workshop on Proof eXchange for Theorem Proving (PxTP '11)*, 2011.
- P. Fontaine, J. Y. Marion, S. Merz, L. Nieto, and A. Tiu, “Expressiveness + automation + soundness: towards combining SMT solvers and interactive proof assistants,” in *Tools and Algorithms for the Construction and Analysis of Systems*, H. Hermanns and J. Palsberg, Eds., vol. 3920 of *Lecture Notes in Computer Science*, pp. 167–181, Springer, Berlin, Germany, 2006.
- D. Oe, A. Reynolds, and A. Stump, “Fast and flexible proof checking for SMT,” in *Proceedings of the 7th International Workshop on Satisfiability Modulo Theories (SMT '09)*, pp. 6–13, ACM, August 2009.
- A. Cimatti, A. Griggio, and R. Sebastiani, “A simple and flexible way of computing small unsatisfiable cores in SAT modulo theories,” in *Theory and Applications of Satisfiability Testing SAT*, J. Marques-Silva and K. Sakallah, Eds., vol. 4501 of *Lecture Notes in Computer Science*, pp. 334–339, Springer, Berlin, Germany, 2007.
- Y. Ge and C. Barrett, “Proof translation and SMT-LIB benchmark certification: a preliminary report,” in *Proceedings of International Workshop on Satisfiability Modulo Theories (SMT '08)*, August 2008.
- S. Böhme, “Proof reconstruction for Z3 in Isabelle/HOL,” in *Proceedings of the 7th International Workshop on Satisfiability Modulo Theories (SMT '09)*, August 2009.
- M. Moskal, “Rocket-fast proof checking for SMT solvers,” in *Tools and Algorithms for the Construction and Analysis of Systems*, C. Ramakrishnan and J. Rehof, Eds., vol. 4963 of *Lecture Notes in Computer Science*, pp. 486–500, Springer, Berlin, Germany, 2008.
- C. Barrett, A. Stump, and C. Tinelli, “The SMT-LIB standard: version 2.0,” in *Proceedings of the 8th International Workshop on Satisfiability Modulo Theories*, Edinburgh, UK, 2010.