EURASIP Journal on Embedded Systems
Volume 2008 (2008), Article ID 647953, 13 pages
doi:10.1155/2008/647953
Research Article

System-on-Chip Environment: A SpecC-Based Framework for Heterogeneous MPSoC Design

Center for Embedded Computer Systems, University of California, Irvine, CA 92697-2625, USA

Received 1 October 2007; Revised 4 March 2008; Accepted 10 June 2008

Academic Editor: Christoph Grimm

Copyright © 2008 Rainer Dömer et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

The constantly growing complexity of embedded systems is a challenge that drives the development of novel design automation techniques. C-based system-level design addresses the complexity challenge by raising the level of abstraction and integrating the design processes for the heterogeneous system components. In this article, we present a comprehensive design framework, the system-on-chip environment (SCE), which is based on the influential SpecC language and methodology. SCE implements a top-down system design flow based on a specify-explore-refine paradigm with support for heterogeneous target platforms consisting of custom hardware components, embedded software processors, dedicated IP blocks, and complex communication bus architectures. Starting from an abstract specification of the desired system, models at various levels of abstraction are automatically generated through successive step-wise refinement, resulting in a pin- and cycle-accurate system implementation. The seamless integration of automatic model generation, estimation, and verification tools enables rapid design space exploration and efficient MPSoC implementation. Using a large set of industrial-strength examples with a wide range of target architectures, our experimental results demonstrate the effectiveness of our framework and show significant productivity gains in design time.

1. Introduction

The rising complexity of embedded systems challenges the established design techniques and processes. Novel, nontraditional design approaches become necessary in order to keep up with the increasing demands of higher productivity.

A well-known technique to address the system design challenge is system-level design which raises the level of abstraction, exploits the reuse of intellectual property (IP), and integrates the traditionally separate design processes of the heterogeneous system components. By combining the design flows of hardware units, software processors, third-party IPs, and the interconnecting bus architectures, system-level design emphasizes the system perspective of the overall design task and enables design space exploration across domains. However, successful system design depends on efficient design automation techniques and, in particular, effective tool support.

In this article, we describe the system-on-chip environment (SCE), a system-level design framework based on the SpecC language and methodology [1]. SCE realizes a top-down refinement-based system design flow with support for heterogeneous target platforms consisting of custom hardware components, embedded software processors, dedicated IP blocks, and complex communication bus architectures.

1.1. SCE Methodology

Figure 1 gives an overview of the design flow with SCE. Starting with an abstract specification model in the system design phase, the designer automatically generates transaction level models (TLM) of the design at successively lower levels of abstraction. Based on component models from the system database and design decisions made by the user, the generated models carry an increasing amount of implementation detail.

Figure 1: System-on-chip environment (SCE) design flow.

SCE follows a specify-explore-refine methodology [2]. The design process starts from a model specifying the design functionality (specify). At each following step, the designer first explores the design space (explore) and makes the necessary design decisions. SCE then automatically generates a new model by integrating the decisions into the previous model (refine).
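The specify-explore-refine loop described above can be pictured as a small driver. The following Python sketch is purely illustrative: the names `Model`, `refine`, and `run_flow`, as well as the example decisions, are hypothetical and are not part of SCE.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    """A design model at one abstraction level (hypothetical stand-in)."""
    level: str
    decisions: tuple = ()


def refine(model, step, decision):
    """Generate a new model that integrates one set of design decisions."""
    recorded = (step, tuple(sorted(decision.items())))
    return Model(level=step, decisions=model.decisions + (recorded,))


def run_flow(spec, steps):
    """Apply explore/refine pairs in order and return every generated model."""
    models = [spec]
    for step, explore in steps:
        decision = explore(models[-1])                      # explore: designer or plugin decides
        models.append(refine(models[-1], step, decision))   # refine: automatic model generation
    return models


# Made-up decisions for two refinement steps.
steps = [
    ("architecture", lambda m: {"B1": "CPU", "B2": "HW"}),
    ("scheduling", lambda m: {"CPU": "priority"}),
]
models = run_flow(Model("specification"), steps)
print([m.level for m in models])
```

Each generated model keeps the full history of decisions, which mirrors the idea that every refined model reflects all decisions made so far.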

After the system design phase is complete, the hardware and software components in the system model are implemented by the hardware and software synthesis phases, respectively. As a combined result, a pin- and cycle-accurate implementation model is generated. Also, binary images for the software processors, as well as register-transfer level (RTL) descriptions in Verilog for the hardware blocks, are created for further synthesis and manufacturing of the intended multiprocessor system-on-chip (MPSoC).

Three design models used in the SCE design flow are shown in more detail in Figure 2. In these and later figures that describe design models, we use the graphical notation introduced with the SpecC language [1]. In general, rectangular boxes represent components, and interconnections are indicated by lines (wires) and arrows (busses). Encapsulated computational blocks, called behaviors, are shown as rectangular boxes with round corners, whereas high-level communication is indicated by channels (ellipses) and interfaces (half circles).

Figure 2: Generic SCE design models.

Figure 2(a) depicts a simple generic specification model. The model consists of a hierarchy of five behaviors and four communication channels. Except for the system functionality, this model is free of any implementation details. During the system design phase, it will be mapped to a platform architecture (see Section 3.1) and single-threaded processing elements (PEs) will be scheduled (Section 3.2). Communication elements (CEs), such as bus bridges and transducers, and system busses will be added to the model as well (Sections 3.3 and 3.4).

As a result of each of these model refinement steps, a TLM is generated, as shown in Figure 2(b). Depending on the number of implementation decisions taken, the TLM accurately reflects the number and type of PEs in the architecture, the mapping of behaviors to the PEs, and the mapping of channels to the system busses and CEs. Note that the communication in this model is still at the abstract transaction level.

After hardware and software synthesis (see Sections 3.5 and 3.6, resp.), a cycle-accurate implementation model is generated, as illustrated in Figure 2(c). In this model, embedded software is represented in detailed layers, including the real-time operating system (RTOS) and the hardware abstraction layer (HAL). Custom hardware blocks, on the other hand, are represented accurately by RTL finite state machine (FSM) models. Finally, system communication is also refined down to a pin- and cycle-accurate level.

1.2. Related Work

Traditionally, system design is dominated by simulation-centric approaches with horizontal integration of models at specific levels of abstraction. Approaches range from the cosimulation of different low-level languages [3–5] to the combination of heterogeneous models of computation in a common simulation environment [6]. In between, C-based system-level design languages (SLDLs), such as SystemC [7] and Handel-C [8], emerged as vehicles for transaction-level modeling (TLM) [9]. Most cases, however, are limited to simulation only and lack vertical integration with synthesis flows that provide a path to implementation.

The first attempts at providing system design environments were approaches for hardware/software codesign. Examples of such environments include COSYMA [10], COSMOS [11], and POLIS [12]. These approaches, however, are based on architecture templates consisting of a single microcontroller assisted by a custom hardware coprocessor, and are thus limited to narrow target architectures.

More recently, design environments emerged that provide support for more complex multiprocessor systems. The OCAPI system [13, 14] is based on an object-oriented modeling of designs using a C++ class library and focuses on reconfigurable hardware devices (FPGAs). The OSSS methodology [15] defines an automated system design flow from a cycle-accurate specification written in an object-oriented variant of SystemC. Supporting architecture exploration and automated refinement via intermediate design models, OSSS feeds into the FOSSY synthesis tool for implementation in hardware and software.

Around the TLM concept, several SystemC-based approaches exist that deal with assembly, validation, and to some extent automatic generation of communication [16–20]. Metropolis [21, 22] is a modeling and simulation environment based on the platform-based design paradigm. The key idea is to separate function, architecture, and model of computation into separate models. Although Metropolis allows cosimulation of heterogeneous PEs as well as different models of computation, a refinement or verification flow between different abstraction levels has not emerged. None of the above frameworks provides a comprehensive, automated approach for the design of complete MPSoCs from abstract specification down to final implementation.

SCE was built on experiences obtained from its predecessor, SpecSyn [2]. While SpecSyn was based on the SpecCharts language, an extension of VHDL, SCE is based on SpecC, which extends ANSI-C for hardware and system modeling.

With respect to our previous publications, which focus on point tools within the SCE environment and are referenced where applicable, this article is the first comprehensive, cohesive, and complete description of the SCE framework. In other words, for the first time we describe the entire SpecC methodology as implemented by a real working environment. As such, this article focuses on the integration of the tools (including scripting facilities, file formats, and annotations) that realize an efficient top-down system design flow, all the way from an abstract system specification down to a pin- and cycle-accurate implementation. We also list the design decisions taken at each step and thus provide a complete picture of the input the system designer needs to provide based on application knowledge and design experience. In addition, we demonstrate the effectiveness of the SCE framework and the complete design flow using the combined results of six design experiments with real-world examples. Furthermore, this article describes for the first time the integration of verification tools into the design flow and framework.

2. SCE Architecture

SCE is based on the separation of design tasks into two distinct steps: decision making and model refinement. Model refinement takes design decisions and generates a new model of the design reflecting and implementing the decisions.

In SCE, model refinement is automated. Decisions, on the other hand, can be entered manually or through a tool box of automated synthesis algorithms. Together, SCE supports an interactive and automated system design process. Automatic model generation removes the need for error-prone and tedious model rewriting. Instead, designers can focus on design exploration and decision making.

Figure 3 shows the generic software architecture for each task in the SCE design and refinement flow. In each step, design decisions are entered by the user through a graphical user interface (GUI), via a command-line scripting and shell interface, or with the help of automated synthesis plugins implementing optimizing algorithms. Based on the design decisions, a refinement process generates a new design model from the input model automatically.

Figure 3: SCE software architecture.

Overall, the SCE framework is formed by the combination of point tools. These tools exchange information through command line interfaces and design models. In general, all tools operate on a given design model. Design decisions, profiling data, and metainformation about the design are stored as annotations attached to the corresponding objects in the design and database models. All models and databases in SCE are described and captured in the form of SpecC internal representation (SIR) files. Using the SpecC compiler, SCE models and databases can be imported from and exported into source files in standard SpecC language format at any time.

2.1. Graphical User Interface

The main interface between the designer and the tools is the GUI [23], which provides various displays and dialogs for browsing of design models and databases, interactive decision entry, and graphical analysis of profiling and estimation results. Furthermore, it includes menus and tool bars to trigger simulation, profiling, refinement, synthesis, and verification actions. For each action, specific command-line tools are called and executed as needed; the GUI supplies the necessary parameters, captures the output, and handles (normal or abnormal) results.

In each session, multiple candidate designs and models can be explored and generated. Information about design models and their relationships, including project-specific compiler and simulator parameters, is tracked by the GUI and can be stored in project files in a custom XML format, allowing for persistent storage, documentation, and exchange of metainformation about the exploration process.

2.2. Simulation and Profiling

All design models in the SCE flow are executable for validation through simulation. Using the SpecC compiler and simulator, models can be compiled and executed at any time. SCE also includes profiling tools to obtain feedback about design quality metrics. Based on a combination of static and dynamic analysis, a retargetable profiler provides a variety of metrics across various levels of abstraction [24]. Initial dynamic profiling derives design characteristics through simulation of the input model. The system designer chooses a set of target PEs, CEs, and busses from the database, and the tool then combines the obtained profiles with the characteristics of the selected components. Thus, SCE profiling is retargetable for static estimation of complete system designs in linear time without the need for time-consuming resimulation or reprofiling.
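To illustrate why retargeting needs no resimulation, the following Python sketch combines one dynamic profile with the characteristics of several candidate PEs. It is a simplification under stated assumptions: the function `estimate_cycles`, the operation categories, and all numbers are made up for illustration and do not describe the actual profiler.

```python
def estimate_cycles(profile, pe_weights):
    """Combine dynamic operation counts with the per-operation costs of one PE."""
    return sum(count * pe_weights[op] for op, count in profile.items())


# Operation counts obtained once by simulating the input model (made-up numbers).
profile = {"add": 1000, "mul": 200, "load": 500}

# Per-operation cycle costs of two candidate PEs (made-up numbers).
pe_database = {
    "cpu": {"add": 1, "mul": 4, "load": 2},
    "dsp": {"add": 1, "mul": 1, "load": 2},
}

# Retargeting is a single pass over the profile per candidate PE.
estimates = {pe: estimate_cycles(profile, weights)
             for pe, weights in pe_database.items()}
print(estimates)  # {'cpu': 2800, 'dsp': 2200}
```

The estimate for each candidate is linear in the size of the profile, so many target components can be compared from a single simulation run.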

The profiling results can also be back-annotated into the output model through refinement. By simulating the refined model, accurate feedback about implementation effects can then be obtained before entering the next design stage.

Because the system is simulated only once during the exploration process, the approach is fast. Because both static and dynamic effects are captured, it is also accurate enough to make high-level decisions. Furthermore, the profiler supports multilevel, multimetric estimation by providing relevant design quality metrics for each stage of the design process. Therefore, profiling guides the user in the design process and enables rapid and early design space exploration.

2.3. Verification

SCE also integrates a formal verification tool. Our equivalence verification technology is based on model algebra [25], which is a formalism for symbolic representation and transformation of system level models. The formalism itself consists of a set of objects and composition rules. The objects are behaviors, synchronization channels, variables, and ports. The composition rules for control flow, blocking and nonblocking communication, and hierarchy allow creation of formal models. Functionality-preserving transformation rules are also defined on model algebraic expressions. Each of these transformation rules is proven sound with respect to a trace-based notion of functional equivalence.

The incorporation of model algebra-based verification in SCE follows the refinement flow. Well-formed models in SpecC can easily be translated to respective model algebraic expressions. The system designer simply selects an original and a refined model and invokes the verification tool. The tool then converts the models and applies the transformation rules to derive the refined model from the original model. The two models are equivalent by virtue of the soundness of the transformation rules. The original model is then checked for isomorphism against the derived model and the differences, if any, are reported. It must be noted that the number and order of transformation rules used for the model derivation step depend on the type of refinement. Since the key concept in SCE is the well-defined semantics of models at different abstraction levels, the order of transformation rules can be easily established. Therefore, equivalence verification becomes not only tractable, but straightforward.
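The general idea of rule-based rewriting followed by a structural comparison can be shown with a toy example. The Python sketch below is not the model algebra of [25]: the expression encoding, the two rewrite rules (flattening of nested compositions and reordering of parallel children), and the names `normalize` and `equivalent` are assumptions chosen for illustration. It also compares canonical forms of both models, which is a simplification of deriving one model from the other.

```python
def normalize(expr):
    """Rewrite an expression tree into a canonical form.

    Toy rules (assumptions, not the actual model algebra rules):
    nested 'seq' or 'par' nodes are flattened into their parent, and
    the children of 'par' are sorted because their order is irrelevant.
    """
    if isinstance(expr, str):              # leaf behavior
        return expr
    op, *children = expr
    flat = []
    for child in map(normalize, children):
        if isinstance(child, tuple) and child[0] == op:
            flat.extend(child[1:])         # flatten same-operator nesting
        else:
            flat.append(child)
    if op == "par":
        flat.sort(key=repr)
    return (op, *flat)


def equivalent(original, refined):
    """Report whether both models reduce to the same canonical form."""
    return normalize(original) == normalize(refined)


original = ("seq", "A", ("par", "B", "C"), "D")
refined = ("seq", ("seq", "A", ("par", "C", "B")), "D")   # extra hierarchy, reordered par
broken = ("seq", "A", "D", ("par", "B", "C"))             # execution order changed
print(equivalent(original, refined), equivalent(original, broken))  # True False
```

Because every rewrite step preserves the toy semantics, a mismatch after normalization points to a genuine structural difference between the two models.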

2.4. Databases

In the SCE design flow, the system is gradually refined using system components from a set of databases [26]. Specifically, SCE includes databases for processing elements (PEs), communication elements (CEs), operating system models, bus or other communication protocols, RTL units and software components. The database components are described as SpecC objects (behaviors or channels). The SpecC hierarchy for a component object in the database defines its structure and functionality for simulation and synthesis. In addition, metadata, such as attributes, parameters, and general information, is stored in the form of annotations attached to the components.

2.5. Scripting Interface

SCE supports scripting of the complete environment from the command line without the need for the GUI. For scripting purposes, a GUI-less command shell of SCE is available. The SCE shell is based on the same libraries as the SCE GUI (not including the GUI layer itself) and offers interactive command-prompt-based or automatic script-based execution.

The SCE shell is based on an embedded Python interpreter that is extended with an API for low-level access to SCE core functionality and internals. For user-level scripting, a complete set of high-level tools on top of the SCE shell is available. Provided scripts include command-line utilities for component allocation, mapping/partitioning, scheduling, connectivity definition, component import, and project handling. These scripts provide a convenient command-line interface for all SCE high-level functionality and decision entry. Together with command-line interfaces to refinement tools and the compiler, a complete scripting of the SCE design flow, through shell scripts or via Makefiles, is available.

3. SCE Design Flow

Figure 4 shows the refinement-based tool flow in SCE from the initial abstract specification down to the final implementation model. In particular, the SCE flow consists of six specific tools which we will describe in the following sections.

Figure 4: Refinement-based tool flow in SCE.

3.1. Architecture Exploration

The first step in the SCE design flow, architecture exploration, defines the target platform and, under a set of design constraints, maps the computational parts of the specification model onto that platform. The target architecture consists of a set of PEs, that is, software processors, custom hardware blocks, and memories. These components are selected by the system designer as part of the decision making. In particular, the designer selects the type and the number of PEs, CEs, and communication busses.

Architecture exploration consists of two tasks: PE allocation and partitioning. PE allocation defines the target architecture by selecting system components (software and hardware processors, memories) from the PE database. Partitioning then maps behaviors and variables to the allocated PEs and memories, respectively.

Following the design decisions of PE allocation and partitioning, the SCE architecture refinement tool inserts an additional layer of hierarchy representing the PEs into the model and groups behaviors and variables under these according to the partitioning. Next, it refines given complex channels into a client-server implementation using message-passing communication between the PEs and inserts necessary synchronization to properly preserve the original execution semantics. Finally, the tool automatically generates the output architecture model [27].
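The grouping step can be sketched in a few lines of Python. The sketch assumes a partitioning given as a behavior-to-PE mapping; the function names, behavior names, and channel names are hypothetical, and the sketch only identifies which channels cross PE boundaries rather than generating any message-passing code.

```python
from collections import defaultdict


def group_by_pe(mapping):
    """Group behaviors under the PE they are mapped to (new hierarchy layer)."""
    groups = defaultdict(list)
    for behavior, pe in mapping.items():
        groups[pe].append(behavior)
    return dict(groups)


def channels_crossing_pes(channels, mapping):
    """Channels whose endpoints sit on different PEs need inter-PE communication."""
    return sorted(name for name, (src, dst) in channels.items()
                  if mapping[src] != mapping[dst])


# Made-up partitioning decision and channel list.
mapping = {"B1": "CPU", "B2": "CPU", "B3": "HW"}
channels = {"c1": ("B1", "B2"), "c2": ("B2", "B3")}

print(group_by_pe(mapping))                       # {'CPU': ['B1', 'B2'], 'HW': ['B3']}
print(channels_crossing_pes(channels, mapping))   # ['c2']
```

Only channel `c2` spans two PEs here, so it is the one that would have to be implemented over communication between components.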

3.2. Scheduling Exploration

A key feature in the SCE design flow is the early evaluation of different scheduling strategies for software processors that are sequential and physically can only execute one task at a time. To evaluate different static and dynamic scheduling algorithms, such as round-robin or priority-based scheduling, we utilize a high-level RTOS model on each processor in the system [28]. Our abstract RTOS model is written on top of the SpecC language and does not require any specific language extensions. It supports all the key concepts found in modern RTOS, including task management, real-time scheduling, preemption, task synchronization, and interrupt handling.

After the designer chooses the desired scheduling strategy (e.g., round-robin, priority-based, or first-come-first-served), the SCE scheduling refinement tool automatically groups the given behaviors in the software PE into tasks and inserts the RTOS model with the user-defined scheduling strategy into the design model. The tool then wraps all primitives and events that can trigger scheduling, such as task activation and termination, IPC synchronization and communication, and timing wait statements, so that the inserted RTOS is called. It finally generates the refined model that can then be simulated for accurate observation and evaluation of dynamic scheduling behavior in the multitasking system. Since our abstract RTOS model requires only minimal overhead in simulation time, this approach enables early and rapid design space exploration.
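How a scheduling strategy changes the observed task behavior can be seen in a minimal Python sketch. It is an assumption-laden toy, not the RTOS model of [28]: it is non-preemptive, and the task set, the field names, and the functions `simulate`, `fcfs`, and `priority` are invented for illustration.

```python
def simulate(tasks, pick):
    """Run tasks one at a time on a single PE; 'pick' selects the next ready task."""
    time, pending, order = 0, list(tasks), []
    while pending:
        ready = [t for t in pending if t["release"] <= time]
        if not ready:                       # idle until the next task is released
            time = min(t["release"] for t in pending)
            continue
        task = pick(ready)
        time += task["duration"]            # non-preemptive: run to completion
        order.append((task["name"], time))  # record the completion time
        pending.remove(task)
    return order


def fcfs(ready):
    """First-come-first-served: earliest release time wins."""
    return min(ready, key=lambda t: t["release"])


def priority(ready):
    """Priority-based: lowest 'prio' value is the most urgent."""
    return min(ready, key=lambda t: t["prio"])


tasks = [
    {"name": "encode", "release": 0, "duration": 5, "prio": 2},
    {"name": "decode", "release": 1, "duration": 3, "prio": 1},
    {"name": "log", "release": 0, "duration": 2, "prio": 3},
]
print(simulate(tasks, fcfs))      # [('encode', 5), ('log', 7), ('decode', 10)]
print(simulate(tasks, priority))  # [('encode', 5), ('decode', 8), ('log', 10)]
```

Swapping the `pick` function is the only change needed to compare strategies, which is the kind of quick what-if evaluation an abstract scheduling model is meant to support.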

3.3. Network Exploration

Network exploration defines the system communication topology and maps the given communication channels onto a network of busses and communication elements (CEs), that is, bridges and transducers. For this, network refinement inserts the required CEs from the database into the model and implements the end-to-end communication over point-to-point links between PEs and CEs [29].

In the input architecture model, PEs communicate via abstract, typed end-to-end channels, and memory interfaces. During network exploration, the user allocates the actual communication media, bridges, and transducers for the system busses and CEs, respectively. Furthermore, the designer defines the connectivity of PE and CE ports to the busses, and maps architecture-level end-to-end channels onto the allocated bus network.

Based on the network decisions by the designer, the SCE network refinement tool inserts and implements the ISO/OSI presentation, network, and transport layers, which implement data conversion, packeting and routing, and acknowledgements, respectively. The tool then generates the new network model such that it reflects the selected network topology, including typed end-to-end architecture-level communication over untyped point-to-point links between the components in each network segment.
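As a rough illustration of the packeting function, the Python sketch below splits an untyped byte message into numbered packets and reassembles it at the receiver. The packet format (a sequence number plus a payload chunk), the payload size, and the function names are assumptions; the text does not specify the format used by the generated layers.

```python
def packetize(message, max_payload):
    """Split a byte message into (sequence number, chunk) packets."""
    offsets = range(0, len(message), max_payload)
    return [(seq, message[off:off + max_payload])
            for seq, off in enumerate(offsets)]


def reassemble(packets):
    """Restore the message, tolerating out-of-order packet arrival."""
    return b"".join(chunk for _, chunk in sorted(packets))


message = b"typed end-to-end data"          # 21 bytes after data conversion
packets = packetize(message, max_payload=8)
print(len(packets))                                    # 3
print(reassemble(list(reversed(packets))) == message)  # True
```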

3.4. Communication Synthesis

Next, the task of communication synthesis is to implement the point-to-point logical links between stations over the actual bus media, and to select and define the final pin- and bit-accurate parameters of the communication architecture under a set of constraints. Communication refinement then inserts protocols and bus-functional component descriptions from the bus and PE/CE databases, respectively, and generates a refined communication model that implements the communication links in each network segment over the actual, shared bus protocol and bus wires. In addition to this pin-accurate model (PAM), our communication refinement also generates a fast-simulating TLM of the system, which abstracts away the pin-level details of individual bus transactions [29].

In the input network model, communication in each network segment is described as a set of logical links. During communication synthesis, the designer (through the GUI, scripting or using synthesis plugins) defines the bus parameters, such as address and interrupt assignments, for each logical link over each bus. Based on these decisions, the SCE communication refinement tool inserts low-level (transaction-level down to pin-accurate) models of busses and components from the databases, and generates a new communication model (PAM or TLM) of the design. In the output model, PE and CE components are refined to implement the lower communication layers (link, stream, media access, and protocol layer) for synchronization, addressing, and media accesses over each bus interface. On top of bus models from the bus database, the generated model hence implements all system communication down to the level of timing-accurate bus transactions (TLM), or cycle-accurate events for sampling and driving of the bus wires (PAM).

3.5. RTL Synthesis

The task of RTL synthesis is to generate structural RTL from the behavioral description of the hardware components in the design. Although the designer can freely choose all behavioral synthesis parameters, including scheduling, allocation, and binding decisions, the SCE RTL synthesis tool supports automatic decision making through plugins. The designer can choose an algorithm to apply to all or only parts of their design. Critical parts of the design, on the other hand, can be manually preassigned or postoptimized [30].
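As an example of the kind of scheduling decision such a plugin might automate, the following Python sketch computes an as-soon-as-possible (ASAP) schedule for a small dataflow graph. ASAP scheduling is a textbook technique chosen here for illustration; the text does not state which algorithms the SCE plugins implement, and the operation names are made up.

```python
def asap_schedule(deps):
    """Assign each operation the earliest control step after all its predecessors."""
    steps = {}

    def step_of(op):
        if op not in steps:
            # An operation with no predecessors is placed in control step 1.
            steps[op] = 1 + max((step_of(p) for p in deps[op]), default=0)
        return steps[op]

    for op in deps:
        step_of(op)
    return steps


# Operation -> list of operations whose results it consumes (made-up graph).
deps = {"m1": [], "m2": [], "a1": ["m1", "m2"], "a2": ["a1"], "m3": []}
print(asap_schedule(deps))  # {'m1': 1, 'm2': 1, 'a1': 2, 'a2': 3, 'm3': 1}
```

A designer could override individual entries of such a schedule for critical operations, in line with the manual preassignment described above.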

Both designers and algorithms can rely on a set of estimates to aid them in the decision making. SCE includes RTL-specific profiling and analysis tools that provide feedback about a variety of metrics including delay, power, and variable lifetimes.

RTL synthesis in SCE takes full advantage of the designers' insight by allowing them to enter, modify, or override their decisions at will. On the other hand, tedious and error-prone tasks including code generation are automated.

3.6. Software Synthesis

For implementing the software components in the system model, SCE relies on a layer-based modeling of the programmable processors and the software stack executing on them. Our embedded processor model supports task scheduling and interrupt handling.

Given scheduling priorities defined by the system designer, the SCE software synthesis tool sc2c automatically generates embedded software code for each processor from the system model [31]. More specifically, we generate efficient ANSI-C code from the SLDL code of the mapped application, and compile and link it against the selected RTOS. The resulting software binary can then be used for cycle-accurate instruction-set simulation within the system model, as well as for the final implementation.

4. Experiments and Results

We have applied SCE to a large set of industrial-strength examples. In the following, we will first demonstrate the SCE design flow in detail as applied to a case study. Next, we summarize our experiences with different examples and show exploration results. Finally, we will present a set of verification experiments.

4.1. Modeling Experiment

In order to demonstrate the overall SCE design flow, we have applied the flow to the example of a mobile phone baseband platform. The specification model of the system is shown in Figure 5. The design combines a JPEG encoder for processing of digital pictures taken by a camera and a voice encoder/decoder (vocoder) for speech processing based on the mobile phone GSM standard. Both JPEG and Vocoder processes are hierarchically composed of subbehaviors implementing the encoding and decoding algorithms in nested and pipelined loops and communicating through abstract message-passing channels. At the top level, a channel Ctrl between the two processes is used to send control messages from the JPEG encoder to the vocoder.

Figure 5: Baseband example: specification model.

For the target platform, we decide to use two software processors assisted by several hardware accelerators. (For space reasons, we do not show the platform model separately; it is almost identical to Figure 6, except that the OS layer and OS channel are omitted.) For the JPEG encoder, we select a Motorola ColdFire processor for the main execution, assisted by a special IP component DCT_IP which performs the needed discrete cosine transformation (DCT) in hardware. We also choose a direct memory access component DMA that receives pixel stripes from the camera and puts them into a shared memory Mem. On the other hand, we select a digital signal processor DSP to perform the majority of the voice encoding and decoding tasks. To reach the required performance, the DSP is assisted by four hardware blocks dedicated to input and output of the data streams, and one custom coprocessor in charge of the codebook search, the most time-critical function in the vocoder.

Figure 6: Baseband example: scheduled architecture model.

In the scheduled model obtained after architecture partitioning and scheduling (Figure 6), the ColdFire processor runs the JPEG encoder in software assisted by the hardware DCT_IP. Since this processor only executes this one task, no operating system is needed and the OS layer CF_OS is empty. On the other hand, the DSP performs two concurrent speech encoding and decoding tasks. These tasks are dynamically scheduled under the control of a priority-based operating system model that sits in an additional OS layer DSP_OS around the DSP. The encoder on the DSP is assisted by a custom hardware coprocessor (HW) for the codebook search. Furthermore, four custom hardware I/O processors perform buffering and framing of the vocoder speech and bit streams.

Table 1 summarizes the design decisions made for implementing the communication channels in the example. As a result of the network exploration, the network is partitioned into one segment per subsystem with a transducer Tx connecting the two segments (Figure 7). Individual point-to-point logical links connect each pair of stations in the resulting network model. Application channels are routed statically over these links; the Ctrl channel, which spans the two subsystems, is routed over two links via the intermediate transducer.
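Static routing over point-to-point links can be sketched as a shortest-path search. In the Python sketch below, the link list is a simplified assumption loosely based on the station names in the example and is not the exact topology of Figure 7; the function `route` is hypothetical.

```python
from collections import deque


def route(links, src, dst):
    """Breadth-first search for the shortest station path from src to dst."""
    neighbors = {}
    for a, b in links:                       # links are bidirectional
        neighbors.setdefault(a, []).append(b)
        neighbors.setdefault(b, []).append(a)
    queue, seen = deque([[src]]), {src}
    while queue:
        path = queue.popleft()
        if path[-1] == dst:
            return path
        for nxt in neighbors.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None                              # no route between the stations


# Simplified, assumed set of logical links between stations.
links = [("ColdFire", "Mem"), ("ColdFire", "Tx"), ("Tx", "DSP"), ("DSP", "HW")]
ctrl_path = route(links, "ColdFire", "DSP")
print(ctrl_path, len(ctrl_path) - 1)  # ['ColdFire', 'Tx', 'DSP'] 2
```

The resulting path uses two links and passes through the transducer, matching the routing of the Ctrl channel described above.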

Table 1: Communication design parameters for baseband example.
Figure 7: Baseband example: network model.

During communication synthesis, all links within each subsystem are implemented over a single shared medium. In both cases, the native ColdFire and DSP processor busses are selected as communication media. Within the segments, unique bus addresses and interrupts for synchronization are assigned to each link. On the ColdFire side, the memory is assigned a range of addresses with a base address plus offsets for each stored variable. On the DSP side, two of the four available interrupts are shared among the four I/O processors. In those cases, additional bus addresses for slave polling are assigned to each link (base address plus one). Finally, a bridge DCT_Br is inserted to translate between the DCT_IP and ColdFire bus protocols.
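The address assignments described above follow a simple pattern that can be sketched in Python. All concrete values are made up: the base addresses, variable sizes, link names, and address stride are assumptions, and the two functions are hypothetical helpers rather than SCE tools.

```python
def assign_variable_addresses(base, variables):
    """Place variables consecutively in memory: base address plus running offset."""
    addresses, offset = {}, 0
    for name, size in variables:
        addresses[name] = base + offset
        offset += size                      # sizes are given in bytes
    return addresses


def assign_link_addresses(base, links, polled, stride=2):
    """Give each link a bus address; links on a shared interrupt also get
    a slave polling address at their base address plus one."""
    table = {}
    for index, link in enumerate(links):
        link_base = base + index * stride
        table[link] = {"data": link_base,
                       "poll": link_base + 1 if link in polled else None}
    return table


mem_map = assign_variable_addresses(0x1000, [("stripe", 64), ("header", 16)])
link_map = assign_link_addresses(0x2000, ["speech_in", "bits_out"],
                                 polled={"bits_out"})
print({name: hex(addr) for name, addr in mem_map.items()})
print(link_map)
```

A stride of two leaves room for the polling address next to each link's data address, so links never overlap regardless of whether they share an interrupt.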

As a result, SCE communication synthesis generates two models, a fast-simulating TLM (Figure 8), and a pin-accurate model (PAM, Figure 9) for further implementation. In the TLM, link, stream, and media access layers are instantiated inside the OS and hardware layers of each station. Inside the processors, interrupt handlers that communicate with link layer adapters through semaphores are created. Interrupt service routines (ISR) together with models of programmable interrupt controllers (PIC) model the processor's interrupt behavior and invoke the corresponding handlers when triggered.

Figure 8: Baseband example: transaction-level model (TLM).
Figure 9: Baseband example: pin-accurate model (PAM).

In the PAM, additionally the communication protocol layers are instantiated. Components are connected via pins and wires driven by the protocol layer adapters. On the ColdFire side, an additional arbiter component regulates bus accesses between the two masters, DMA_BF and CF_BF.

Table 2 summarizes the results for the example design. Using the refinement tools, models of the example design were automatically generated within seconds. A testbench common to all models was created which exercises the design by simultaneously encoding and decoding 163 frames of speech on the vocoder side while performing JPEG encoding of 30 pictures with pixels. We created and refined both models of the whole system and models of each subsystem separately. Note that code sizes (lines of code, LOC) in each case include the testbenches. Since testbench code is shared, the size of the system model is less than the sum of the subsystem model sizes. All models were simulated on a 2.7 GHz Linux workstation using the QuickThreads version of the SpecC simulator.

Table 2: Modeling and simulation results for baseband example.

Figure 10 plots simulation times on a logarithmic scale. The graph shows that simulation times generally grow exponentially with each new model at the next lower level of abstraction. Figure 11, on the other hand, shows the simulated overall frame transcoding delays (back-to-back encoding and decoding) in the vocoder and the picture encoding delays in the JPEG encoder. As can be seen, with each new model, measured delays converge linearly towards the final result.

Figure 10: Simulation speeds for the baseband example.
Figure 11: Simulated delays in the baseband example.

Note that initial specification models are untimed and hence do not provide any delay measurements at all. Beginning with the architecture level, estimated execution delays are back-annotated into the computation blocks. As expected, scheduling has a large effect on simulation accuracy, and abstract OS modeling enables evaluation of scheduling decisions at native simulation speeds (note that since the amount of simulated parallelism decreases, simulation is potentially even faster than at the specification level). Depending on the ratio of communication to computation, introducing bus models and communication delays at the transaction level further increases accuracy, potentially at the cost of significantly longer simulation times. On the other hand, TLMs allow for accurate modeling of communication close or equivalent to that of pin-accurate models but at higher speed.

Our results show that with increasing implementation detail at lower levels of abstraction, accuracy (as measured by the simulated delays) improves linearly while model complexities (as measured by code sizes and simulation times) grow exponentially. All in all, our results support the choice of intermediate models in the design flow that allow for fast validation of critical design aspects at early stages of the design process.

4.2. Exploration Experiments

In order to demonstrate our approach in terms of design space exploration for a wide variety of designs, we applied SCE to the design of six industrial-strength examples: stand-alone versions of the JPEG encoder (JPEG) and the GSM voice codec (Vocoder), floating- and fixed-point versions of an MP3 decoder (MP3float and MP3fix), the previously introduced baseband example (Baseband), and a Cellphone example combining the JPEG encoder, the MP3 decoder, and the GSM vocoder in a platform mimicking the one used in the RAZR cellphone. For each example, we generated different architectures using Motorola DSP56600 (DSP), Motorola ColdFire (CF), and ARM7TDMI (ARM) processors together with custom hardware coprocessors (HW, DCT) and I/O units. We used various communication architectures with DSP, CF, ARM (AMBA AHB), and simple handshake busses.

Table 3 summarizes the features and parameters of the different design examples we tested. For each example, the target architectures are specified as a list of masters plus slaves for each bus in the system where the bus type is implicitly determined to be the protocol of the primary master on the bus. For example, in the case of the MP3float design, the ColdFire processor communicates with dedicated hardware units over its CF bus whereas the HW units communicate with each other through separate handshake busses. For simplicity, routing, address, and interrupt assignment decisions are not shown in this table.
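The convention of listing masters plus slaves per bus, with the bus type implied by the primary master, can be captured in a small data structure. The sketch below is illustrative only: the `Bus` class, the `PROTOCOL_OF` mapping, and the component names `HW1` and `HW2` are hypothetical and are not taken from Table 3. It also assumes that a bus whose primary master is not a processor falls back to a simple handshake protocol.

```python
from dataclasses import dataclass
from typing import List

# Processor type -> native bus protocol, as named in the text.
PROTOCOL_OF = {"DSP": "DSP", "CF": "CF", "ARM": "AMBA AHB"}


@dataclass
class Bus:
    masters: List[str]  # the first entry is the primary master
    slaves: List[str]

    def protocol(self) -> str:
        # The bus type is implied by the primary master; anything that is
        # not a known processor is assumed to use a handshake bus.
        return PROTOCOL_OF.get(self.masters[0], "handshake")


# Hypothetical architecture in the spirit of the MP3float example:
# the ColdFire talks to hardware units over its CF bus, while the
# hardware units talk to each other over a separate handshake bus.
architecture = [
    Bus(masters=["CF"], slaves=["HW1", "HW2"]),
    Bus(masters=["HW1"], slaves=["HW2"]),
]
bus_types = [bus.protocol() for bus in architecture]
```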

Table 3: Design examples and target architectures.

Table 4 shows the results of exploration of the design space for the different examples. Overall model complexities are given in terms of code size using lines of code (LOC) as a metric. Results show significant differences in complexity between input and generated output models due to extra implementation detail added between abstraction levels.

Table 4: Results for exploration experiments.

Note that manual refinement would require tremendous effort (on the order of days). Automatic refinement, on the other hand, completes on the order of seconds. Our results therefore show that a significant productivity gain can be achieved using SCE with automatic model refinement.

4.3. Verification Experiments

We implemented the SCE equivalence verification tool scver to verify the refinements above network level. Since the lowest abstraction level of communication in model algebra is the channel, models below network level in the SCE flow could not be directly translated into model algebraic representation.

The results for verification of architecture, scheduling, and network refinements are presented in Table 5. We used two benchmarks, namely, the JPEG encoder and Vocoder, as shown in column 1. The model algebraic representation was stored in a graph data structure, with nodes being the objects and edges being the composition rules. Column 5 shows the total number of transformations applied to derive model 1 from model 2 using the transformation rules of model algebra. Because the order of the transformations is already decided, applying them took only a few seconds, even for representations with hundreds of nodes and edges. The verification time also includes the time it took to parse the SpecC models into model algebraic representation and to perform isomorphism checking between the derived and original model graphs.
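The final isomorphism check between two model graphs can be sketched as follows. This is not the scver implementation: the graph encoding (a dictionary of node kinds plus a set of labeled edges), the function name `isomorphic`, and the brute-force search are all assumptions chosen for brevity. The brute-force search tries every node mapping and is only practical for very small graphs, unlike the graphs with hundreds of nodes reported in Table 5.

```python
from itertools import permutations


def isomorphic(g1, g2):
    """Check whether two labeled directed graphs are isomorphic.

    Each graph is a pair (nodes, edges): `nodes` maps a node name to its
    kind, and `edges` is a set of (source, rule, destination) triples.
    """
    nodes1, edges1 = g1
    nodes2, edges2 = g2
    if len(nodes1) != len(nodes2) or len(edges1) != len(edges2):
        return False
    if sorted(nodes1.values()) != sorted(nodes2.values()):
        return False
    names1 = sorted(nodes1)
    for perm in permutations(sorted(nodes2)):
        mapping = dict(zip(names1, perm))
        # The mapping must preserve node kinds ...
        if any(nodes1[a] != nodes2[b] for a, b in mapping.items()):
            continue
        # ... and map the edge set of g1 exactly onto that of g2.
        mapped = {(mapping[s], rule, mapping[d]) for s, rule, d in edges1}
        if mapped == edges2:
            return True
    return False


# Hypothetical toy graphs: two behaviors connected through a channel.
derived = ({"a": "behavior", "b": "behavior", "c": "channel"},
           {("a", "writes", "c"), ("c", "reads", "b")})
original = ({"x": "behavior", "y": "channel", "z": "behavior"},
            {("z", "writes", "y"), ("y", "reads", "x")})
same = isomorphic(derived, original)
```

In this toy case the two graphs differ only in node names, so the check succeeds. Reversing the direction of one edge in either graph makes it fail.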

Table 5: Results for equivalence verification.

The results demonstrate that the SCE tool flow based on well-defined model abstractions and semantics enables fast equivalence verification.

5. Summary and Conclusion

In this work, we have presented SCE, a comprehensive system design framework based on the SpecC language. SCE supports a wide range of heterogeneous target platforms consisting of custom hardware components, embedded software processors, dedicated IP blocks, and complex communication bus architectures.

The SCE design flow is based on a series of automated model refinement steps where the system designer makes the decisions and SCE quickly provides estimation feedback, generates new models automatically, and validates them through simulation and formal verification. The effective design automation tools integrated in SCE allow rapid and extensive design space exploration. The fast exploration capabilities, in turn, enable the designer to optimize the system architecture, the scheduling policies, the communication network, and the hardware and software components, so that an optimal implementation is reached quickly.

We have demonstrated the benefits of SCE using six industrial-size examples with varying target architectures, which have been designed and verified top-to-bottom. Compared to manual coding and model refinement, SCE achieves productivity gains of orders of magnitude.

SCE has been successfully transferred to and applied in industrial settings. SER, a commercial derivative of SCE, has been developed and integrated into ELEGANT, an environment for electronic system-level (ESL) design of space and satellite electronics that was commissioned by the Japanese Aerospace Exploration Agency (JAXA). ELEGANT and SER have been successfully delivered to JAXA's suppliers and are currently being introduced into the general market [32].

Acknowledgments

The authors would like to thank all members of the CECS SpecC group who have contributed to SCE over the years. Special thanks go to David Berner, Pramod Chandraiah, Quoc-Viet Dang, Alexander Gluhak, Eric Johnson, Raphael Lopez, Gunar Schirner, Ines Viskic, Shuqing Zhao, and Jianwen Zhu.

References

  1. D. D. Gajski, J. Zhu, R. Dömer, A. Gerstlauer, and S. Zhao, SpecC: Specification Language and Design Methodology, Kluwer Academic Publishers, Dordrecht, The Netherlands, 2000.
  2. D. D. Gajski, F. Vahid, S. Narayan, and J. Gong, Specification and Design of Embedded Systems, Prentice Hall, Upper Saddle River, NJ, USA, 1994.
  3. P. Coste, F. Hessel, Ph. Le Marrec, et al., “Multilanguage design of heterogeneous systems,” in Proceedings of the 7th International Workshop on Hardware/Software Codesign (CODES '99), pp. 54–58, Rome, Italy, May 1999.
  4. P. Gerin, S. Yoo, G. Nicolescu, and A. A. Jerraya, “Scalable and flexible cosimulation of SoC designs with heterogeneous multi-processor target architectures,” in Proceedings of the Asia and South Pacific Design Automation Conference (ASP-DAC '01), pp. 63–68, Yokohama, Japan, January-February 2001.
  5. ModelSim SE User's Manual, Mentor Graphics Corp.
  6. J. Buck, S. Ha, E. A. Lee, and D. G. Messerschmitt, “Ptolemy: a framework for simulating and prototyping heterogeneous systems,” International Journal of Computer Simulation, vol. 4, no. 2, pp. 155–182, 1994.
  7. T. Grötker, S. Liao, G. Martin, and S. Swan, System Design with SystemC, Kluwer Academic Publishers, Dordrecht, The Netherlands, 2002.
  8. M. Aubury, I. Page, G. Randall, J. Saul, and R. Watts, Handel-C language reference guide, Oxford University Computing Laboratory, Oxford, UK, August 1996.
  9. F. Ghenassia, Transaction-Level Modeling with SystemC: TLM Concepts and Applications for Embedded Systems, Springer, New York, NY, USA, 2005.
  10. A. Österling, T. Brenner, R. Ernst, D. Herrmann, T. Scholz, and W. Ye, “The COSYMA system,” in Hardware/Software Co-Design: Principles and Practice, J. Staunstrup and W. Wolf, Eds., Kluwer Academic Publishers, Dordrecht, The Netherlands, 1997.
  11. C. A. Valderrama, M. Romdhani, J.-M. Daveau, G. F. Marchioro, A. Changuel, and A. A. Jerraya, “Cosmos: a transformational co-design tool for multiprocessor architectures,” in Hardware/Software Co-Design: Principles and Practice, J. Staunstrup and W. Wolf, Eds., Kluwer Academic Publishers, Dordrecht, The Netherlands, 1997.
  12. F. Balarin, M. Chiodo, P. Giusto, et al., Hardware-Software Co-Design of Embedded Systems: The POLIS Approach, Kluwer Academic Publishers, Dordrecht, The Netherlands, 1997.
  13. G. Vanmeerbeeck, P. Schaumont, S. Vernalde, M. Engels, and I. Bolsens, “Hardware/software partitioning for embedded systems in OCAPI-xl,” in Proceedings of the International Symposium on Hardware-Software Codesign (CODES '01), Copenhagen, Denmark, April 2001.
  14. P. Schaumont, S. Vernalde, L. Rijnders, M. Engels, and I. Bolsens, “A programming environment for the design of complex high speed ASICs,” in Proceedings of the 35th Annual Conference on Design Automation (DAC '98), pp. 315–320, San Francisco, Calif, USA, June 1998.
  15. K. Grüttner, F. Oppenheimer, W. Nebel, A.-M. Fouilliart, and F. Colas-Bigey, “SystemC-based modelling, seamless refinement, and synthesis of a JPEG 2000 decoder,” in Proceedings of the Design, Automation and Test in Europe Conference (DATE '08), pp. 128–133, Munich, Germany, March 2008.
  16. W. O. Cesário, D. Lyonnard, G. Nicolescu, et al., “Multiprocessor SoC platforms: a component-based design approach,” IEEE Design and Test of Computers, vol. 19, no. 6, pp. 52–63, 2002.
  17. D. Lyonnard, S. Yoo, A. Baghdadi, and A. A. Jerraya, “Automatic generation of application-specific architectures for heterogeneous multiprocessor system-on-chip,” in Proceedings of the 38th Annual Conference on Design Automation (DAC '01), pp. 518–523, Las Vegas, Nev, USA, June 2001.
  18. K. van Rompaey, I. Bolsens, H. De Man, and D. Verkest, “CoWare—a design environment for heterogeneous hardware/software systems,” in Proceedings of the European Design Automation Conference (EURO-DAC '96), pp. 252–257, Geneva, Switzerland, September 1996.
  19. W. Klingauf, H. Gädke, and R. Günzel, “TRAIN: a virtual transaction layer architecture for TLM-based HW/SW codesign of synthesizable MPSoC,” in Proceedings of the Design, Automation and Test in Europe Conference (DATE '06), vol. 1, Munich, Germany, March 2006.
  20. T. Kempf, M. Doerper, R. Leupers, et al., “A modular simulation framework for spatial and temporal task mapping onto multi-processor SoC platforms,” in Proceedings of the Design, Automation and Test in Europe Conference (DATE '05), vol. 2, pp. 876–881, Munich, Germany, March 2005.
  21. F. Balarin, Y. Watanabe, H. Hsieh, L. Lavagno, C. Passerone, and A. Sangiovanni-Vincentelli, “Metropolis: an integrated electronic system design environment,” Computer, vol. 36, no. 4, pp. 45–52, 2003.
  22. A. L. Sangiovanni-Vincentelli, “Quo vadis SLD: reasoning about trends and challenges of system-level design,” Proceedings of the IEEE, vol. 95, no. 3, pp. 467–506, 2007.
  23. S. Abdi, J. Peng, H. Yu, et al., “System-on-chip environment (SCE version 2.2.0 beta): tutorial,” Center for Embedded Computer Systems, University of California, Irvine, Calif, USA, July 2003.
  24. L. Cai, A. Gerstlauer, and D. Gajski, “Retargetable profiling for rapid, early system-level design space exploration,” in Proceedings of the 41st Annual Conference on Design Automation (DAC '04), pp. 281–286, San Diego, Calif, USA, June 2004.
  25. S. Abdi and D. Gajski, “Verification of system level model transformations,” International Journal of Parallel Programming, vol. 34, no. 1, pp. 29–59, 2006.
  26. A. Gerstlauer, L. Cai, D. Shin, H. Yu, J. Peng, and R. Dömer, SCE database reference manual, version 2.2.0 beta, Center for Embedded Computer Systems, University of California, Irvine, Calif, USA, July 2003.
  27. J. Peng and D. Gajski, “Optimal message-passing for data coherency in distributed architecture,” in Proceedings of the 15th International Symposium on System Synthesis, pp. 20–25, Kyoto, Japan, October 2002.
  28. A. Gerstlauer, H. Yu, and D. D. Gajski, “RTOS modeling for system level design,” in Proceedings of the Design, Automation and Test in Europe Conference (DATE '03), Munich, Germany, March 2003.
  29. A. Gerstlauer, D. Shin, J. Peng, R. Dömer, and D. D. Gajski, “Automatic layer-based generation of system-on-chip bus communication models,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 26, no. 9, pp. 1676–1687, 2007.
  30. D. Shin, A. Gerstlauer, R. Dömer, and D. D. Gajski, “An interactive design environment for C-based high-level synthesis of RTL processors,” IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 16, no. 4, pp. 466–475, 2008.
  31. H. Yu, R. Dömer, and D. Gajski, “Embedded software generation from system level design languages,” in Proceedings of the Asia and South Pacific Design Automation Conference (ASP-DAC '04), pp. 463–468, Yokohama, Japan, January 2004.
  32. CECS eNews Volume 7, Issue 3, Center for Embedded Computer Systems, University of California, Irvine, Calif, USA, July 2007, http://www.cecs.uci.edu/enews/CECSeNewsJul07.pdf.