Research Letter  Open Access
Sudarshan K. Srinivasan, Koushik Sarker, Rajendra S. Katti, "Token-Aware Completion Functions for Elastic Processor Verification", Journal of Electrical and Computer Engineering, vol. 2009, Article ID 480740, 5 pages, 2009. https://doi.org/10.1155/2009/480740
Token-Aware Completion Functions for Elastic Processor Verification
Abstract
We develop a formal verification procedure to check that elastic pipelined processor designs correctly implement their instruction set architecture (ISA) specifications. The notion of correctness we use is based on refinement. Refinement proofs are based on refinement maps, which, in the context of this problem, are functions that map elastic processor states to states of the ISA specification model. Data flow in elastic architectures is complicated by the insertion of any number of buffers in any place in the design, making it hard to construct refinement maps for elastic systems in a systematic manner. We introduce token-aware completion functions, which incorporate a mechanism to track the flow of data in elastic pipelines, as a highly automated and systematic approach to construct refinement maps. We demonstrate the efficiency of the overall verification procedure based on token-aware completion functions using six elastic pipelined processor models based on the DLX architecture.
1. Introduction
Persistent technology scaling has brought a previously ignored set of design challenges, such as manufacturing and process variability and the increasing significance of wire delays. These challenges threaten to invalidate the effectiveness of synchronous design paradigms at the system level. Several alternative design paradigms are being proposed to deal with these challenges. One popular trend is latency-insensitive design, which allows for variability in data propagation delays [1]. Synchronous Elastic Networks (SENs) [2, 3] have been proposed as an effective approach to designing latency-insensitive systems.
One of the critical challenges for any design approach to succeed is verification. We present a novel, highly automated formal verification solution for latency-insensitive pipelined microprocessors developed using the SEN approach (hereafter referred to as elastic processors). Note that correctness proofs for methods to synthesize elastic designs from synchronous designs have been provided [2], but this is not a substitute for verification. The idea of the verification approach is to show that the elastic processor correctly implements all behaviors of its instruction set architecture (ISA) model, which is used as the high-level specification for the processor. The notion of correctness that we use is Well-Founded Equivalence Bisimulation (WEB) refinement, a detailed description of which can be found in [4]. To establish that the elastic processor (the implementation) refines, that is, correctly implements, its ISA (the specification), it is sufficient to prove that the two satisfy the following core WEB refinement correctness formula.
Definition 1.1 (Core WEB Refinement Correctness Formula). One has
$$\langle \forall w \in \mathrm{IMPL} :: s = r(w) \wedge u = \mathit{Sstep}(s) \wedge v = \mathit{Istep}(w) \wedge u \neq r(v) \Rightarrow s = r(v) \wedge \mathit{rank}(v) < \mathit{rank}(w) \rangle$$
In the formula above, IMPL denotes the set of implementation states, Istep is a step of the implementation machine, and Sstep is a step of the specification machine. The refinement map $r$ is a function that maps implementation states to specification states. In fact, the refinement map can be thought of as an instrument to view the behaviors of the implementation machine at the specification level, thereby allowing verification tools to easily compare the behaviors of the two systems. The function $\mathit{rank}$ is used for deadlock detection. Our focus in this work is to check safety, that is, to show that if the implementation makes progress, then the result of that progress is correct as specified by the high-level specification. We plan to address deadlock detection in future work, and we therefore ignore $\mathit{rank}$ for the present.
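As an illustration only, the safety part of this check can be sketched in Python on a toy pair of machines. The names `safety_violations`, `sstep`, `istep`, `buggy_istep`, and `refmap` are hypothetical and are not part of the ACL2-SMT models; the sketch assumes that an implementation step is acceptable when its image under the refinement map either matches the specification step or leaves the specification state unchanged (a stuttering step).

```python
from typing import Callable, Hashable, Iterable, List

State = Hashable


def safety_violations(
    reachable: Iterable[State],
    istep: Callable[[State], State],
    sstep: Callable[[State], State],
    refmap: Callable[[State], State],
) -> List[State]:
    """Return the implementation states whose next step is not safe.

    For each state w: s = refmap(w), u = sstep(s), v = istep(w).
    The step is accepted when refmap(v) == u (progress that matches the
    specification) or refmap(v) == s (stutter, no visible progress).
    Deadlock detection is deliberately left out of this sketch.
    """
    bad = []
    for w in reachable:
        s = refmap(w)
        u = sstep(s)
        v = istep(w)
        if refmap(v) != u and refmap(v) != s:
            bad.append(w)
    return bad


# Toy specification: a counter modulo 4.
def sstep(s: int) -> int:
    return (s + 1) % 4


# Toy implementation: needs two steps per count; state is (count, phase).
def istep(w):
    count, phase = w
    return (count, 1) if phase == 0 else ((count + 1) % 4, 0)


# A deliberately wrong implementation that skips a count.
def buggy_istep(w):
    count, phase = w
    return (count, 1) if phase == 0 else ((count + 2) % 4, 0)


# Refinement map: project out the visible counter value.
def refmap(w) -> int:
    return w[0]


REACHABLE = [(c, p) for c in range(4) for p in (0, 1)]

print(safety_violations(REACHABLE, istep, sstep, refmap))
print(safety_violations(REACHABLE, buggy_istep, sstep, refmap))
```

In the paper's setting the check is discharged symbolically by a decision procedure rather than by enumerating states; the explicit loop here is only meant to make the roles of the three functions concrete.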
The specific steps involved in a refinement-based verification methodology are (a) construct models of the specification and implementation, (b) compute the states of the implementation model that are reachable from reset (known as reachable states), (c) construct a refinement map, and (d) use the models and the refinement map to state the refinement-based correctness formula (excluding deadlock detection) for the implementation model, which can then be automatically checked for the set of all reachable states using a decision procedure. Modeling and verification are performed using ACL2-SMT [5], a system developed by combining the ACL2 theorem prover (version 3.3) with the Yices decision procedure (version 1.0.10) [6].
The primary challenge in applying the refinement-based approach to elastic pipelines is as follows. The very attractive property of elastic systems is that they allow for the insertion of buffers (known as elastic buffers) in any place in the data path to deal with propagation delays of long wires, without altering the functionality of the system. The insertion of these buffers, however, can drastically change the data flow patterns of the system, making it hard to compute refinement maps for these systems. Our primary contribution is a procedure, which we call token-aware completion functions, that computes refinement maps for elastic pipelined systems (described in Section 3) even after the insertion of elastic buffers in any place in the data path. The procedure allows for highly automated and efficient verification of elastic pipelined systems. The effectiveness of our verification method is demonstrated using six DLX-based elastic pipelined processor models. The models are described in Section 2. Verification results are given in Section 4, and we conclude in Section 5. Due to limited space, we refer the reader to the literature for background on synchronous elastic networks [2, 3] and refinement [4].
Note that this is the first known approach that aims to verify the correctness of elastic pipelined processors against their high-level non-pipelined ISA specifications. In previous work, we developed an equivalence checking approach that is used to verify elastic pipelines against their synchronous parent pipelines [7].
2. Elastic Processor Models
The elastic processor models are based on the 5-stage DLX pipeline. The elastic processor models and their non-pipelined ISA-level specifications are described using the ACL2 programming language and are defined at the term level, because term-level abstractions make the verification problem tractable. We use the ACL2-SMT system for verification as it can be used to reason at the term level. Note that bit-level versions of these models were used in [7]. The models were obtained by first elasticizing a synchronous 5-stage DLX processor using the Synchronous Elastic Flow (SELF) protocol approach [2]. The main idea is to replace all flip-flops with elastic buffers (EBs) that are constructed from two elastic half buffers (EHBs), namely, a master EHB and a slave EHB. The clock network is replaced by a network of elastic controllers, where each controller is used to control the elastic buffers in a pipeline stage and is synchronized with the controllers of adjacent pipeline stages. The controllers are synchronized with the clock and are connected in accordance with the connections between pipeline stages in the data path. Each controller has three possible states, which indicate that the corresponding elastic buffer holds 0, 1, or 2 valid data tokens, respectively.
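The token-counting view of an elastic buffer can be illustrated with a small Python abstraction. This is a simplified sketch, not the SELF protocol itself: the class `ElasticBuffer`, its `step` method, and the `downstream_ready` flag are hypothetical names, and the handshake signals of the real controllers are collapsed into a single readiness flag.

```python
from typing import List, Optional, Tuple


class ElasticBuffer:
    """Abstract elastic buffer that holds 0, 1, or 2 valid data tokens."""

    CAPACITY = 2  # one token per half buffer (master and slave)

    def __init__(self) -> None:
        self.tokens: List[str] = []

    @property
    def count(self) -> int:
        """Number of valid tokens currently stored (0, 1, or 2)."""
        return len(self.tokens)

    def step(
        self, data_in: Optional[str] = None, downstream_ready: bool = True
    ) -> Tuple[Optional[str], bool]:
        """Advance one clock step.

        Returns (data_out, accepted): the token passed downstream, if any,
        and whether the incoming token was stored. An incoming token is
        refused only when the buffer is still full after the output phase.
        """
        data_out = None
        if downstream_ready and self.tokens:
            data_out = self.tokens.pop(0)  # oldest token leaves first
        accepted = False
        if data_in is not None and len(self.tokens) < self.CAPACITY:
            self.tokens.append(data_in)
            accepted = True
        return data_out, accepted


eb = ElasticBuffer()
trace = [
    eb.step("i1", downstream_ready=False),  # stored, buffer holds 1 token
    eb.step("i2", downstream_ready=False),  # stored, buffer holds 2 tokens
    eb.step("i3", downstream_ready=False),  # refused, buffer is full
    eb.step(None, downstream_ready=True),   # oldest token is released
]
print(trace, eb.count)
```

The point of the sketch is only that a stalled downstream stage does not lose data: up to two tokens are retained, and further input is refused until a token drains.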
We call the processor model obtained by elasticizing the synchronous DLX M0. The main advantage of the elastic processor is that it permits the insertion of additional elastic buffers at any place in the data path to break long wires. We therefore inserted additional elastic buffers at various places in the model. We inserted an additional elastic buffer in model M0 to get model M1. We then inserted another elastic buffer in model M1 to get M2. We derived models M3, M4, and M5 in a similar manner. The model M5 is shown in Figure 1. The figure also shows the positions of the additional elastic buffers and how they are connected with the elastic buffers corresponding to the pipeline latches. The network of elastic controllers for the DLX processor with five additional elastic buffers in the data path is shown in Figure 2. These models are used to demonstrate the effectiveness of our verification approach.
3. Token-Aware Completion Functions
Flushing [8] is one standard approach used to compute refinement maps for pipelined processors. In this approach, partially executed instructions in the pipeline latches are forced to complete, without allowing the machine to fetch any new instructions. Projecting out the programmer-visible components in the resulting state (the program counter, register file, instruction memory, and data memory for the models we consider) gives the corresponding ISA state.
Completion functions [9] were proposed as a computationally efficient approach to construct flushing refinement maps. One completion function for each pipeline latch in the machine is used to compute the effect on the programmer-visible components of completing any partially executed instruction in that latch. The completion functions are composed to form the flushing refinement map. Note that older instructions in the pipeline are completed before younger instructions. For the DLX example, let fdc, dec, emc, and mmc be the completion functions for the four pipeline latches, ordered from the latch nearest the fetch stage to the latch nearest the end of the pipeline. The ISA state corresponding to a synchronous DLX processor state is then given by the nested composition fdc(dec(emc(mmc(...)))), operating on the register file, instruction memory, and data memory of the processor model, in which mmc is applied first and fdc last.
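The composition order can be made concrete with a small Python sketch. It is a toy under stated assumptions: a single generic function `complete` stands in for the per-latch completion functions, an instruction in a latch is reduced to a pending register write, and the names `complete` and `flushing_refinement_map` are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# A partially executed instruction, reduced to (destination register, value);
# None models a latch that holds no instruction.
Instr = Optional[Tuple[str, int]]
Visible = Dict[str, int]  # programmer-visible state, here just registers


def complete(latch: Instr, visible: Visible) -> Visible:
    """Completion function for one latch: commit its pending write, if any."""
    updated = dict(visible)
    if latch is not None:
        dest, value = latch
        updated[dest] = value
    return updated


def flushing_refinement_map(
    visible: Visible, latches_young_to_old: List[Instr]
) -> Visible:
    """Compose the completion functions, oldest instruction first."""
    result = dict(visible)
    for latch in reversed(latches_young_to_old):
        result = complete(latch, result)
    return result


# Four latches, listed from the youngest instruction to the oldest.
latches = [("r1", 7), None, ("r2", 5), ("r1", 3)]
isa_state = flushing_refinement_map({"r1": 0, "r2": 0}, latches)
print(isa_state)
```

Because the oldest write to `r1` is applied first and the youngest last, the final value of `r1` is the one written by the youngest instruction, which is what a flushed pipeline would produce.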
When we try to apply the completion functions approach to elastic pipelined processors, two issues arise. First, in some states of the elastic processor, instructions can be duplicated in the data path; that is, an instruction can reside in two pipeline latches. Such a situation can occur at a fork when the instruction in a buffer before the fork has proceeded along one path of the fork, but the other path is blocked. The latch before the fork has to retain the instruction until both paths are cleared. A direct application of the completion-functions-based map to such a state will result in completing the same instruction twice, leading to an erroneous refinement map. Second, Elastic Half Buffers (EHBs) need not have valid tokens. The contents of such EHBs should be ignored and should not be used to update the programmer-visible components.
We introduce token-aware completion functions as a method to compute flushing-based refinement maps for elastic pipelined processors. The idea is that EHBs which are either holding duplicate instructions or are in an empty state should not be completed. This is achieved by first computing the reachable states of the elastic controller network. We use the token-flow diagrams proposed in [7] to compute the reachable states of the system. The reachability analysis is performed by simulating how tokens flow in the elastic architecture using a form of symbolic simulation. The output of the token-flow diagrams is a set of token-states, one token-state for each reachable state. In a token-state, each EHB is assigned a numbered token, which is essentially a natural number. A value of "0" indicates a bubble; that is, the EHB is empty. Also, EHBs with the same instruction will be assigned the same token number. Thus, using the token-state, duplicate instructions and empty EHBs can be identified.
The token-aware completion functions approach works by first computing a two-dimensional array that we call the token-array. Each row in the array corresponds to a reachable state of the elastic controller network. Each element in a row is a binary value. The number of elements in a row is $2n$, where $n$ is the number of pipeline latches in the elastic system. If token-array$[i][j] = 1$, then the contents of EHB $j$ in reachable state $i$ should be completed. If token-array$[i][j] = 0$, then the contents of EHB $j$ in reachable state $i$ should be ignored when computing the refinement map. Given the set of token-states (which are the reachable states represented using numbered tokens) of the elastic controller network of an elastic system, Procedure 3.1 computes the token-array for the elastic system.
Procedure 3.1.
In: $T$, the set of token-states of the elastic controller network, and $B$, the ordered set of pipeline half buffers. The number of token-states ($|T|$) is $m$. The number of pipeline half buffers ($|B|$) is $2n$, where $n$ is the number of pipeline latches. The order of the pipeline half buffers is determined by the position of the buffer in the pipeline; that is, buffers closer to the end of the pipeline have a higher index.
Out: token-array for the elastic system.
(1) Initialize $i$ to $m$.
(2) Initialize $V$ (the set of visited tokens) to $\{0\}$. The token number "0" represents a bubble. Note that initializing $V$ to $\{0\}$ causes the procedure to assign a "0" value to the empty EHBs in the token-array.
(3) Initialize $j$ to $2n$.
(4) Let token $t = \mathit{tok}(i, j)$, where $\mathit{tok}$ is a lookup function that gives the token number for EHB $j$ in token-state $i$.
(5) token-array$[i][j] = 0$ if $t \in V$, and token-array$[i][j] = 1$ otherwise.
(6) Assign $V = V \cup \{t\}$: add the token number of EHB $j$ to the visited token set.
(7) If $j > 1$, decrement $j$ and go to step 4.
(8) If $i > 1$, decrement $i$ and go to step 2.
Procedure 3.2 takes as input the token-array and computes the flushing refinement map for the elastic system using completion functions.
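The token-array computation of Procedure 3.1 can be sketched in Python as follows. This is a minimal sketch under stated assumptions: each token-state is represented as a list of token numbers ordered so that a higher index is closer to the end of the pipeline, the function name `compute_token_array` is hypothetical, and the two example token-states are made up for illustration rather than taken from the processor models.

```python
from typing import List


def compute_token_array(token_states: List[List[int]]) -> List[List[int]]:
    """Mark, per reachable state, which half buffers should be completed.

    An entry is 1 when the half buffer holds the first occurrence of a
    non-zero token, scanning from the end of the pipeline backwards, and
    0 when it holds a bubble (token 0) or a duplicate of a token already seen.
    """
    token_array = []
    for state in token_states:
        visited = {0}  # token 0 is a bubble, so bubbles are never completed
        row = [0] * len(state)
        for j in range(len(state) - 1, -1, -1):
            token = state[j]
            row[j] = 0 if token in visited else 1
            visited.add(token)
        token_array.append(row)
    return token_array


# Two illustrative token-states over six half buffers.
example_states = [
    [4, 3, 3, 2, 0, 1],  # token 3 is duplicated, one bubble
    [0, 0, 2, 2, 1, 0],  # token 2 is duplicated, three bubbles
]
token_array = compute_token_array(example_states)
print(token_array)
```

The rows are independent of one another, so the sketch visits the token-states in forward order; only the backward scan within a row matters, since it decides which copy of a duplicated token is kept.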
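Procedure 3.2.
In: Elastic processor state $w = \langle P, b_1, \ldots, b_{2n} \rangle$, where $P$ denotes the programmer-visible components, $b_1, \ldots, b_{2n}$ are the half buffers in the pipeline latches of the elastic machine, and $n$ is the number of pipeline latches.
Out: ISA state obtained by applying the flushing refinement map to $w$.
(1) Let $s = P$.
(2) Initialize $j$ to $2n$.
(3) $i = \text{reachable-state}(w)$, which gives the number of the reachable elastic controller network state of $w$, assuming that the reachable states are numbered.
(4) If token-array$[i][j] = 1$, then $s = c_j(b_j, s)$, where $c_j$ is the completion function for half buffer $b_j$; otherwise $s$ is left unchanged.
(5) If $j > 1$, decrement $j$ and go to step 3.
(6) Then, $s$ is the ISA state corresponding to $w$.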
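A Python sketch of the refinement-map computation of Procedure 3.2 is given below, again as an illustration rather than the ACL2 model. It assumes that the row of the token-array for the current controller state has already been selected, that an instruction is reduced to "add a delta to a register" so that completing it twice is visible, and that one generic completion function `complete_add` serves every half buffer; all of these names and simplifications are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Visible = Dict[str, int]
Instr = Tuple[str, int]  # (register, delta to add): not idempotent on purpose
CompletionFn = Callable[[Instr, Visible], Visible]


def complete_add(buf: Instr, visible: Visible) -> Visible:
    """Generic completion function: apply the buffered register increment."""
    reg, delta = buf
    updated = dict(visible)
    updated[reg] = updated[reg] + delta
    return updated


def token_aware_refinement_map(
    visible: Visible,
    half_buffers: List[Instr],
    token_row: List[int],
    completion_fns: List[CompletionFn],
) -> Visible:
    """Complete only the half buffers marked 1, oldest (highest index) first."""
    result = dict(visible)
    for j in range(len(half_buffers) - 1, -1, -1):
        if token_row[j] == 1:
            result = completion_fns[j](half_buffers[j], result)
    return result


# Index 0 is nearest fetch, index 3 nearest the end of the pipeline.
# Buffers 0 and 1 hold the same instruction; buffer 2 is empty, so its
# contents are stale and must not be used.
buffers = [("r1", 10), ("r1", 10), ("r2", 99), ("r2", 5)]
fns = [complete_add] * len(buffers)
start = {"r1": 0, "r2": 0}

aware = token_aware_refinement_map(start, buffers, [0, 1, 0, 1], fns)
naive = token_aware_refinement_map(start, buffers, [1, 1, 1, 1], fns)
print(aware, naive)
```

Passing a row of all ones reproduces the direct application of completion functions, which completes the duplicated instruction twice and commits the stale contents of the empty buffer; the token-aware row avoids both errors.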
Example 3.3. The elastic controller network of the processor model has two reachable states. The token-states corresponding to these two reachable states, each given as a vector of token numbers for the EHBs of the model, are reported in [7]. Note that there are two tokens in the token-states for each EB, one corresponding to the master EHB and the other corresponding to the slave EHB. For each of the two reachable controller states, Procedures 3.1 and 3.2 yield a completion-function-based refinement map that applies to any state of the processor model whose elastic controller network is in that controller state. For the second of these states, the map is a nested composition of the form fdc(fdc(dec(mmc(...)))).
4. Results
The token-aware completion functions approach was used to verify safety for six elastic pipelined processors. The results are shown in Table 1. Verification was performed using the ACL2-SMT system. The ACL2-SMT system incorporates a translator that reduces the correctness theorem to a decision problem in the form of a formula in a decidable logic that Yices can handle. The decision problem is then checked by Yices. Column "Bool Vars" gives the number of Boolean variables in the decision problem. The experiments were conducted on a 1.8 GHz Intel(R) Core(TM) Duo CPU, with an L1 cache size of 2048 KB. As can be seen from the table, each of the elastic 5-stage DLX-based processors was verified against the high-level instruction set architecture (ISA) within 25 seconds, thereby demonstrating the high efficiency of our approach.

5. Conclusions
We have developed a method for checking the correctness of elastic pipelined processors against their high-level instruction set architectures. The approach was demonstrated by verifying six DLX-based elastic processor models. For future work, we plan to further explore the scalability of the verification method.
References
[1] L. P. Carloni, K. L. McMillan, and A. L. Sangiovanni-Vincentelli, "Theory of latency-insensitive design," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 20, no. 9, pp. 1059–1076, 2001.
[2] J. Cortadella, M. Kishinevsky, and B. Grundmann, "Synthesis of synchronous elastic architectures," in Proceedings of the 43rd Annual Design Automation Conference (DAC '06), E. Sentovich, Ed., pp. 657–662, San Francisco, Calif, USA, July 2006.
[3] S. Krstic, J. Cortadella, M. Kishinevsky, and J. O'Leary, "Synchronous elastic networks," in Formal Methods in Computer Aided Design (FMCAD '06), pp. 19–30, IEEE Computer Society, San Jose, Calif, USA, November 2006.
[4] P. Manolios, Mechanical Verification of Reactive Systems, Ph.D. thesis, University of Texas, Austin, Tex, USA, August 2001, http://www.ccs.neu.edu/home/pete/research/phddissertation.html.
[5] S. K. Srinivasan, Efficient Verification of Bit-Level Pipelined Machines Using Refinement, Ph.D. thesis, Georgia Institute of Technology, December 2007, http://etd.gatech.edu/theses/available/etd08242007111625/.
[6] Yices, 2007, http://fm.csl.sri.com/yices/.
[7] S. K. Srinivasan, K. Sarker, and R. S. Katti, "Verification of synchronous elastic processors," to appear in IEEE Embedded Systems Letters.
[8] J. R. Burch and D. L. Dill, "Automatic verification of pipelined microprocessor control," in Proceedings of the 6th International Conference on Computer Aided Verification (CAV '94), vol. 818 of Lecture Notes in Computer Science, pp. 68–80, Springer, Stanford, Calif, USA, June 1994.
[9] R. Hosabettu, M. Srivas, and G. Gopalakrishnan, "Proof of correctness of a processor with reorder buffer using the completion functions approach," in Proceedings of the 11th International Conference on Computer Aided Verification (CAV '99), N. Halbwachs and D. Peled, Eds., vol. 1633 of Lecture Notes in Computer Science, Springer, Trento, Italy, July 1999.
Copyright
Copyright © 2009 Sudarshan K. Srinivasan et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.