﻿<?xml version="1.0" encoding="utf-8"?><rss version="2.0"><channel><title>VLSI Design</title><link>http://www.hindawi.com</link><description>The latest articles from Hindawi Publishing Corporation</description><copyright>&amp;#169; 2012, Hindawi Publishing Corporation. All rights reserved.</copyright><item><title>An Empirical Investigation on System and Statement Level Parallelism Strategies for Accelerating Scatter Search Using Handel-C and Impulse-C</title><link>http://www.hindawi.com/journals/vlsi/2012/793196/</link><description>Scatter Search is an effective and established population-based metaheuristic that has been used to solve a variety of hard optimization problems. However, the time required to find high-quality solutions can become prohibitive as problem sizes grow. In this paper, we present a hardware implementation of Scatter Search on a field-programmable gate array (FPGA). Our objective is to improve the run time of Scatter Search by exploiting the potentially massive performance benefits that are available through the native parallelism in hardware. When implementing Scatter Search we employ two different high-level languages (HLLs): Handel-C and Impulse-C. Our empirical results show that by effectively exploiting source-code optimizations, data parallelism, and pipelining, a 28x speed up over software can be achieved.</description><Author>M. Walton, O. Ahmed, G. Grewal, and S. Areibi</Author><copyright>Copyright &amp;#xa9; 2012 M. Walton et al. All rights reserved.</copyright></item><item><title>Test Generation for Crosstalk-Induced Delay Faults in VLSI Circuits Using Modified FAN Algorithm</title><link>http://www.hindawi.com/journals/vlsi/2012/745861/</link><description>As design trends move toward nanometer technology, new problems due to noise effects lead to a decrease in reliability and performance of VLSI circuits. Crosstalk is one such noise effect which affects the timing behaviour of circuits. 
In this paper, an efficient Automatic Test Pattern Generation (ATPG) method based on a modified Fanout Oriented (FAN) algorithm to detect crosstalk-induced delay faults in VLSI circuits is presented. Tests are generated for ISCAS_85 and the enhanced scan version of ISCAS_89 benchmark circuits. Experimental results demonstrate that the test program gives better fault coverage, fewer backtracks, and hence reduced test generation time for most of the benchmark circuits when compared to modified Path-Oriented Decision Making (PODEM) based ATPG. The number of transitions is also reduced, thus reducing the power dissipation of the circuit.</description><Author>S. Jayanthy, M. C. Bhuvaneswari, and Keesarapalli Sujitha</Author><copyright>Copyright &amp;#xa9; 2012 S. Jayanthy et al. All rights reserved.</copyright></item><item><title>A Methodology for Generation of Performance Models for the Sizing of Analog High-Level Topologies</title><link>http://www.hindawi.com/journals/vlsi/2011/475952/</link><description>This paper presents a systematic methodology for the generation of high-level performance models for analog component blocks. The transistor sizes of the circuit-level implementations of the component
blocks along with a set of geometry constraints applied over them define the sample space. A Halton
sequence generator is used as a sampling algorithm. Performance data are generated by simulating
each sampled circuit configuration through SPICE. Least squares support vector machine (LS-SVM) is
used as a regression function. Optimal values of the model hyperparameters are determined through a
grid search-based technique and a genetic algorithm- (GA-) based technique. The high-level models of the
individual component blocks are combined analytically to construct the high-level model of a complete
system. The constructed performance models have been used to implement a GA-based high-level topology
sizing process. The advantages of the present methodology are that the constructed models are accurate
with respect to real circuit-level simulation results, fast to evaluate, and have a good generalization
ability. In addition, the model construction time is low and the construction process does not require
any detailed knowledge of circuit design. The entire methodology has been demonstrated with a set
of numerical results.</description><Author>Soumya Pandit, Chittaranjan Mandal, and Amit Patra</Author><copyright>Copyright &amp;#xa9; 2011 Soumya Pandit et al. All rights reserved.</copyright></item><item><title>A High-Throughput, High-Accuracy System-Level Simulation Framework for System on Chips</title><link>http://www.hindawi.com/journals/vlsi/2011/726014/</link><description>Today&amp;#39;s System-on-Chips (SoCs) design is extremely challenging because it involves complicated design tradeoffs and heterogeneous design expertise. To explore the large solution space, system architects have to rely on system-level simulators to identify an optimized SoC architecture. In this paper, we propose a system-level simulation framework, System Performance Simulation Implementation Mechanism, or SPSIM. Based on SystemC TLM2.0, the framework consists of an executable SoC model, a simulation tool chain, and a modeling methodology. Compared with the large body of existing research in this area, this work is aimed at delivering a high simulation throughput and, at the same time, guaranteeing a high accuracy on real industrial applications. Integrating the leading TLM techniques, our simulator can attain a simulation speed that is not slower than that of the hardware execution by a factor of 35 on a set of real-world applications. SPSIM incorporates effective timing models, which can achieve a high accuracy after hardware-based calibration. Experimental results on a set of mobile applications proved that the difference between the simulated and measured results of timing performance is within 10&amp;#37;, which in the past can only be attained by cycle-accurate models.</description><Author>Guanyi Sun, Shengnan Xu, Xu Wang, Dawei Wang, Eugene Tang, Yangdong Deng, and Sun Chan</Author><copyright>Copyright &amp;#xa9; 2011 Guanyi Sun et al. 
All rights reserved.</copyright></item><item><title>Vertical Gate RF SOI LIGBT for SPICs with Significantly Improved Latch-Up Immunity</title><link>http://www.hindawi.com/journals/vlsi/2011/548546/</link><description>Based on the previous achievements in improving latch-up immunity of SOI LIGBT, process simulation on our proposed VG RF SOI NLIGBT was carried out with TCAD to provide a virtually fabricated device structure. Then, an approximate latching current model was derived according to the condition of minimum regenerative feedback couple between the parasitic dual-transistors. The model indicates that its latching current is a few orders higher than those before. Further verification through device simulation was done with TCAD, which proved that its weak snapback voltage in the off state is about 0.5&amp;#8211;2.75 times higher than those breakdown voltages reported before, its breakdown voltage in the off state is about 19&amp;#x2009;V higher than its weak snapback voltage, and its latching current density in the on state is about 2-3 orders of magnitude higher than those reported before at room temperature due to hole current bypass through P+ contact in P-well region. Therefore, it is characterized by significantly improved latch-up immunity.</description><Author>Haipeng Zhang, Ruisheng Qi, Liang Zhang, Buchun Su, and Dejun Wang</Author><copyright>Copyright &amp;#xa9; 2011 Haipeng Zhang et al. All rights reserved.</copyright></item><item><title>Lossless and Low-Power Image Compressor for Wireless Capsule Endoscopy</title><link>http://www.hindawi.com/journals/vlsi/2011/343787/</link><description>We present a lossless and low-complexity image compression algorithm for endoscopic images. The algorithm consists of a static prediction scheme and a combination of golomb-rice and unary encoding. It does not require any buffer memory and is suitable to work with any commercial low-power image sensors that output image pixels in raster-scan fashion. 
The proposed lossless algorithm has compression ratio of approximately 73&amp;#37; for endoscopic images. Compared to the existing lossless compression standard such as JPEG-LS, the proposed scheme has better compression ratio, lower computational complexity, and lesser memory requirement. The algorithm is implemented in a 0.18&amp;#x2009;&amp;#x03BC;m CMOS technology and consumes 0.16&amp;#x2009;mm &amp;#x00D7; 0.16&amp;#x2009;mm silicon area and 18&amp;#x2009;&amp;#x03BC;W of power when working at 2 frames per second.</description><Author>Tareq Hasan Khan and Khan A. Wahid</Author><copyright>Copyright &amp;#xa9; 2011 Tareq Hasan Khan and Khan A.  Wahid. All rights reserved.</copyright></item><item><title>Advancement in Nanoscale CMOS Device Design En Route to Ultra-Low-Power Applications</title><link>http://www.hindawi.com/journals/vlsi/2011/178516/</link><description>In recent years, the demand for power sensitive designs has grown significantly due to the fast growth of battery-operated portable applications. As the technology scaling continues unabated, subthreshold device design has gained a lot of attention due to the low-power and ultra-low-power consumption in various applications. Design of low-power high-performance submicron and deep submicron CMOS devices and circuits is a big challenge. Short-channel effect is a major challenge for scaling the gate length down and below 0.1&amp;#x2009;&amp;#x003BC;m. Detailed review and potential solutions for prolonging CMOS as the leading information technology proposed by various researchers in the past two decades are presented in this paper. This paper attempts to categorize the challenges and solutions for low-power and low-voltage application and thus provides a roadmap for device designers working in the submicron and deep submicron region of CMOS devices separately.</description><Author>Subhra Dhar, Manisha Pattanaik, and Poolla Rajaram</Author><copyright>Copyright &amp;#xa9; 2011 Subhra Dhar et al. 
All rights reserved.</copyright></item><item><title>CONTANGO: Integrated Optimization of SoC Clock Networks</title><link>http://www.hindawi.com/journals/vlsi/2011/407507/</link><description>On-chip clock networks are remarkable in their impact on the performance and power of synchronous circuits, in their susceptibility to adverse effects of semiconductor technology scaling, as well as in their strong potential for improvement through better CAD algorithms and tools. Existing literature is rich in ideas and techniques but performs large-scale optimization using analytical models that lost accuracy at recent technology nodes and have rarely been validated by realistic
SPICE simulations on large industry designs. Our work offers a methodology for SPICE-accurate optimization
of clock networks, coordinated to satisfy slew constraints and achieve best tradeoffs between skew, insertion delay, power,
as well as tolerance to variations. Our implementation, called Contango, is evaluated on 45&amp;#x2009;nm benchmarks from IBM Research and Texas Instruments with up to 50&amp;#x2009;K sinks. It outperforms all published results in terms of skew and shows superior scalability.</description><Author>Dong-Jin Lee and Igor L. Markov</Author><copyright>Copyright &amp;#xa9; 2011 Dong-Jin Lee and Igor L. Markov. All rights reserved.</copyright></item><item><title>New Considerations for Spectral Classification of Boolean Switching  Functions</title><link>http://www.hindawi.com/journals/vlsi/2011/356137/</link><description>This paper presents some new considerations for spectral techniques for classification of Boolean functions. These considerations incorporate discussions of the feasibility of extending this classification technique beyond n=5. A new implementation is presented along with a basic analysis of the complexity of the problem. We also note a correction to results in this area that were reported in previous work.</description><Author>J. E. Rice, J. C. Muzio, and N. Anderson</Author><copyright>Copyright &amp;#xa9; 2011 J. E. Rice et al. All rights reserved.</copyright></item><item><title>A Cost-Effective 10-Bit D/A Converter for Digital-Input MOEMS Micromirror Actuation</title><link>http://www.hindawi.com/journals/vlsi/2010/169079/</link><description>The design of a 10-bit resistor-string digital-to-analog converter (DAC) for MOEMS micromirror interfacing is addressed in this paper. The proposed DAC, realized in a 0.18-&amp;#x3bc;m BCD technology, features a folded resistor-string stage with a switch matrix and address decoders plus an output voltage buffer stage. The proposed DAC and buffer circuitry are key elements of an innovative scanning micromirror actuator, characterized by direct digital input, full differential driving, and linear response. 
With respect to the state-of-the-art resistor-string converters in similar technologies, the proposed DAC has comparable nonlinearity (INL, DNL) performances while it has the advantage of a smaller area occupation, 0.17&amp;#x2009;mm2, including output buffer, and relatively low power consumption, 200&amp;#x2009;&amp;#x3bc;W at 500&amp;#x2009;kSPS and a few &amp;#x3bc;W in idle mode.</description><Author>Sergio Saponara, Tommaso Baldetti, and Luca Fanucci</Author><copyright>Copyright &amp;#xa9; 2010 Sergio Saponara et al. All rights reserved.</copyright></item><item><title>An Approach for Implementing State Machines with Online Testability</title><link>http://www.hindawi.com/journals/vlsi/2010/639747/</link><description>During the last two decades, a significant amount of research has been performed to simplify the detection of transient or soft errors in VLSI-based digital systems. This paper proposes an approach for implementing state machines that uses 2-hot code for state encoding. State machines designed using this approach allow online detection of soft errors in registers and output logic. The 2-hot code considerably reduces the number of required flip-flops and leads to relatively straightforward implementation of next state and output logic. A new way of designing output logic for online fault detection has also been presented.</description><Author>P. K. Lala, A. Mathews, and J. P. Parkerson</Author><copyright>Copyright &amp;#x00A9; 2010 P. K. Lala et al. All rights reserved.</copyright></item><item><title>Simple Exact Algorithm for Transistor Sizing of Low-Power High-Speed Arithmetic Circuits</title><link>http://www.hindawi.com/journals/vlsi/2010/264390/</link><description>A new transistor sizing algorithm, SEA (Simple Exact Algorithm), for optimizing low-power and high-speed arithmetic integrated circuits is proposed. 
In comparison with other transistor sizing algorithms, simplicity, accuracy, independency of order and initial sizing factors of transistors, and flexibility in choosing the optimization parameters such as power consumption, delay, Power-Delay Product (PDP), chip area or the combination of them are considered as the advantages of this new algorithm. More exhaustive rules of grouping transistors are the main trait of our algorithm. Hence, the SEA algorithm dominates some major transistor sizing metrics such as optimization rate, simulation speed, and reliability. According to approximate comparison of the SEA algorithm with MDE and ADC for a number of conventional full adder circuits, delay and PDP have been improved 55.01&amp;#37; and 57.92&amp;#37; on an average, respectively. By comparing the SEA and Chang&amp;#39;s algorithm, 25.64&amp;#37; improvement in PDP and 33.16&amp;#37; improvement in delay have been achieved. All the simulations have been performed with 0.13&amp;#x02009;&amp;#x03BC;m technology based on the BSIM3v3 model using HSpice simulator software.</description><Author>Tooraj Nikoubin, Poona Bahrebar, Sara Pouri, Keivan Navi, and Vaez Iravani</Author><copyright>Copyright &amp;#x00A9; 2010 Tooraj Nikoubin et al. All rights reserved.</copyright></item><item><title>CORDIC Architectures: A Survey</title><link>http://www.hindawi.com/journals/vlsi/2010/794891/</link><description>In the last decade, CORDIC algorithm has drawn wide attention from academia and industry
for various applications such as DSP, biomedical signal processing, software defined radio, neural
networks, and MIMO systems to mention just a few. It is an iterative algorithm, requiring simple
shift and addition operations, for hardware realization of basic elementary functions. Since
CORDIC is used as a building block in various single chip solutions, the critical aspects to be
considered are high speed, low power, and low area, for achieving reasonable overall performance.
In this paper, we first classify the CORDIC algorithm based on the number system and discuss
its importance in the implementation of the CORDIC algorithm. Then, we present a systematic and
comprehensive taxonomy of rotational CORDIC algorithms, which are subsequently discussed
in depth. Special attention has been devoted to the higher radix and flat techniques proposed
in the literature for reducing the latency. Finally, a detailed comparison of various algorithms is
presented, which can provide first-order information to designers looking for either further
improvement of performance or selection of rotational CORDIC for a specific application.</description><Author>B. Lakshmi and A. S. Dhar</Author><copyright>Copyright &amp;#x00A9; 2010 B. Lakshmi and A. S. Dhar. All rights reserved.</copyright></item><item><title>Run-Length-Based Test Data Compression Techniques: How Far from Entropy and Power Bounds?&amp;#8212;A Survey</title><link>http://www.hindawi.com/journals/vlsi/2010/670476/</link><description>The run length based coding schemes have been very effective for
the test data compression in case of current generation SoCs with
a large number of IP cores. The first part of the paper presents a
survey of the run length based codes. The data compression of any
partially specified test data depends upon how the unspecified
bits are filled with 1s and 0s. In the second part of the paper,
five different approaches for &amp;#8220;don&amp;#39;t care&amp;#8221; bit
filling based on the nature of runs are proposed to predict the
maximum compression based on entropy. Here the various run length
based schemes are compared with maximum data compression limit
based on entropy bounds. The actual compressions claimed by the
authors are also compared. For various ISCAS circuits, it has been
shown that when the X filling is done considering runs of zeros
followed by one as well as runs of ones followed by zero (i.e.,
Extended FDR), it provides the maximum data compression. In the third
part, it has been shown that the average test power and peak power
are minimum when the don&amp;#39;t care bits are filled to make
long runs of 0s as well as 1s.</description><Author>Usha S. Mehta, Kankar S. Dasgupta, and Niranjan M. Devashrayee</Author><copyright>Copyright &amp;#x00A9; 2010 Usha S. Mehta et al. All rights reserved.</copyright></item><item><title>Reduced Voltage Scaling in Clock Distribution Networks</title><link>http://www.hindawi.com/journals/vlsi/2009/679853/</link><description>We propose a novel circuit technique to generate a reduced voltage swing (RVS) signals for active power reduction on main buses and clocks. This is achieved without performance degradation, without extra power supply requirement, and with minimum area overhead. The technique stops the discharge path on the net that is swinging low at a certain voltage value. It reduces active power on the target net by as much as 33&amp;#37; compared to traditional full swing signaling. The logic 0 voltage value is programmable through control bits. If desired, the reduced-swing mode can also be disabled. The approach assumes that the logic 0 voltage value is always less than the threshold voltage of the nMOS receivers, which eliminate the need of the low to high voltage translation. The reduced noise margin and the increased leakage on the receiver transistors using this approach have been addressed through the selective usage of multithreshold voltage (MTV) devices and the programmability of the low voltage value.</description><Author>Khader Mohammad, Ayman Dodin, Bao Liu, and Sos Agaian</Author><copyright>Copyright &amp;#x00A9; 2009 Khader Mohammad et al. All rights reserved.</copyright></item><item><title>Low-Cost Allocator Implementations for Networks-on-Chip Routers</title><link>http://www.hindawi.com/journals/vlsi/2009/415646/</link><description>Cost-effective Networks-on-Chip (NoCs) routers are important for future SoCs and embedded devices. Implementation results show that the generic virtual channel allocator (VA) and the generic switch allocator (SA) of a router consume large amount of area and power. 
In this paper, after a careful study of the working principle of a VA and the utilization statistics of its arbiters, opportunities to simplify the generic VA are identified. Then, the deadlock problem for a combined switch and virtual channel allocator (SVA) is studied. Next, the impact of the VA simplification on the router critical paths is analyzed. Finally, the generic architecture and two low-cost architectures proposed (the look-ahead, and the SVA) are evaluated with a cycle-accurate network simulator and detailed VLSI implementations. Results show that both the look-ahead and the SVA significantly reduce area and power compared to the generic architecture. Furthermore, cost savings are achieved without performance penalty.</description><Author>Min Zhang and Chiu-Sing Choy</Author><copyright>Copyright &amp;#x00A9; 2009 Min Zhang and Chiu-Sing Choy. All rights reserved.</copyright></item><item><title>Architectures and Arithmetic for Low Static Power Consumption in Nanoscale CMOS</title><link>http://www.hindawi.com/journals/vlsi/2009/749272/</link><description>This paper focuses on leakage reduction at architecture and arithmetic level. A methodology for considerable reduction of the static power consumption is shown. Simulations are done in a typical 130&amp;#x2009;nm CMOS technology. Based on the simulation results, the static power consumption is estimated and compared for different filter architectures. Substantial power reductions are shown in both FIR-filters and IIR-filters. Three different types of architectures, namely, bit-parallel, digit-serial, and bit-serial structures are used to demonstrate the methodology. The paper also shows that the relative power ratio is strongly dependent on the used word length; that is, the gain in power ratio is larger for longer word lengths. A static power ratio at 0.48 is shown for the bit-serial FIR-filter and a power ratio at 0.11 is shown in the arithmetic part of the FIR-filter. 
The static power ratio in the IIR-filter is 0.36 in the bit-serial filter and 0.06 in the arithmetic part of the filter. It is also shown that the use of storage, such as registers, relatively the arithmetic part, affects the power ratio. The relatively lower power consumption in the IIR-filter compared to the FIR-filter is due to the lower use of registers.</description><Author>Peter Nilsson</Author><copyright>Copyright &amp;#x00A9; 2009 Peter Nilsson. All rights reserved.</copyright></item><item><title>A Multilevel Congestion-Based Global Router</title><link>http://www.hindawi.com/journals/vlsi/2009/537341/</link><description>Routing in nanometer 
nodes creates an elevated level of importance
for low-congestion routing. At the same time,
advances in mathematical programming have
increased the power to solve complex problems,
such as the routing problem. Hence, new routing
methods need to be developed that can
combine advanced mathematical programming and modeling techniques 
to provide low-congestion solutions. In this paper, a hierarchical 
mathematical programming-based global routing technique that 
considers congestion is proposed. The main contributions presented 
in this paper include (i) implementation of congestion estimation 
based on actual routing solutions versus purely probabilistic 
techniques, (ii) development of a congestion-based hierarchy for 
solving the global routing problem, and (iii) generation of a 
robust framework for solving the routing problem using 
mathematical programming techniques. Experimental results 
illustrate that the proposed global router is capable of reducing 
congestion and overflow by as much as 36&amp;#37; compared to the 
state-of-the-art mathematical programming models.</description><Author>Logan Rakai, Laleh Behjat, Shawki Areibi, and Tamas Terlaky</Author><copyright>Copyright &amp;#x00A9; 2009 Logan Rakai et al. All rights reserved.</copyright></item><item><title>Floorplan-Driven Multivoltage High-Level Synthesis</title><link>http://www.hindawi.com/journals/vlsi/2009/156751/</link><description>As the semiconductor technology advances, interconnect plays a more and more important role in power consumption in VLSI systems. This also imposes a challenge in high-level synthesis, in which physical information is limited and conventionally considered after high-level synthesis. To close the gap between high-level synthesis and physical implementation, integration of physical synthesis and high-level synthesis is essential. In this paper, a technique named FloM is proposed for integrating floorplanning into high-level synthesis of VLSI system with multivoltage datapath. Experimental results obtained show that the proposed technique is effective and the energy consumed by both the datapath and the wires can be reduced by more than 40&amp;#37;.</description><Author>Xianwu Xing and Ching Chuen Jong</Author><copyright>Copyright &amp;#x00A9; 2009 Xianwu Xing and Ching Chuen Jong. All rights reserved.</copyright></item><item><title>A New XOR Structure Based on Resonant-Tunneling High Electron Mobility Transistor</title><link>http://www.hindawi.com/journals/vlsi/2009/803974/</link><description>A new structure for an exclusive-OR (XOR) gate based on the resonant-tunneling high electron mobility transistor (RTHEMT) is introduced which comprises only an RTHEMT and two FETs. Calculations are done by utilizing a new subcircuit model for simulating the RTHEMT in the SPICE simulator. 
Details of the design, input, and output values and margins, delay of each transition, maximum operating frequency, static and dynamic power dissipations of the new structure are discussed and calculated and the performance is compared with other XOR gates which confirm  that the presented structure has a high performance. Furthermore, to the best of authors&amp;#39; knowledge, it has the least component count in comparison to the existing structures.</description><Author>Mohammad Javad Sharifi and Davoud Bahrepour</Author><copyright>Copyright &amp;#x00A9; 2009 Mohammad Javad Sharifi and Davoud Bahrepour. All rights reserved.</copyright></item><item><title>Recent Advances on the Design of High-Gain Wideband Operational Transconductance Amplifiers</title><link>http://www.hindawi.com/journals/vlsi/2009/323595/</link><description>Feed-forward techniques are explored for the design of high-frequency Operational
Transconductance Amplifiers (OTAs). For single-stage amplifiers, a recycling folded-cascode OTA presents twice
the GBW (197.2&amp;#x2009;MHz versus 106.3&amp;#x2009;MHz) and more than twice the slew rate (231.1&amp;#x2009;V/&amp;#x03BC;s versus 99.3&amp;#x2009;V/&amp;#x03BC;s) as a conventional folded cascode OTA for the same load, power consumption, and transistor dimensions. It is demonstrated that the efficiency of the recycling folded-cascode is equivalent to that of a telescopic OTA. As for multistage amplifiers, a No-Capacitor Feed-Forward (NCFF) compensation scheme which uses a high-frequency pole-zero doublet to obtain
greater than 90&amp;#x2009;dB DC gain, GBW of 325&amp;#x2009;MHz, and better than 70&amp;#x00B0; phase margin is discussed. The settling time of the NCFF topology can be faster than that of OTAs with Miller compensation. Experimental results for the recycling folded-cascode OTA fabricated in TSMC 0.18&amp;#x2009;&amp;#x03BC;m CMOS, and results of the NCFF demonstrate the efficiency and feasibility of the feed-forward schemes.</description><Author>Rida Assaad and Jose Silva-Martinez</Author><copyright>Copyright &amp;#x00A9; 2009 Rida Assaad and Jose Silva-Martinez. All rights reserved.</copyright></item><item><title>Device and Circuit Design Challenges in the Digital Subthreshold 
Region for Ultralow-Power Applications</title><link>http://www.hindawi.com/journals/vlsi/2009/283702/</link><description>In recent years, subthreshold operation has gained a lot of attention due to ultra low-power consumption in applications requiring low to medium performance. It has also been shown that by optimizing the device structure, power consumption of digital subthreshold
logic can be further minimized while improving its performance. Therefore, subthreshold circuit design is very promising for future ultra low-energy sensor applications as well as high-performance parallel processing. This paper deals with various device and circuit design challenges associated with the state of the art in optimal digital subthreshold circuit design and reviews device design methodologies and circuit topologies for optimal digital subthreshold operation. This paper identifies the suitable candidates for subthreshold operation at device and circuit levels for optimal subthreshold circuit design and provides an effective roadmap for digital designers interested in working with ultra low-power applications.</description><Author>Ramesh Vaddi, S. Dasgupta, and R. P. Agarwal</Author><copyright>Copyright &amp;#x00A9; 2009 Ramesh Vaddi et al. All rights reserved.</copyright></item><item><title>Networks-On-Chip Based on Dynamic Wormhole Packet Identity Mapping Management</title><link>http://www.hindawi.com/journals/vlsi/2009/941701/</link><description>This paper presents a network-on-chip (NoC)
with flexible infrastructure based on dynamic wormhole packet
identity management. The NoCs are developed based on a VHDL
approach and support the design flexibility. The on-chip router
uses a wormhole packet switching method with a synchronous
parallel pipeline technique. Routing algorithms and dynamic
wormhole local packet identity (ID-tag) mapping management are
proposed to support a wire sharing methodology and an ID slot
division multiplexing technique. At each communication link, flits
belonging to the same message have the same local ID-tag, and the
ID-tag is updated before the packet enters the next communication
link by using an ID-tag mapping management unit. Therefore, flits
from different messages can be interleaved, identified, and routed
according to their allocated ID slots. Our NoC guarantees in-order
and lossless message delivery.</description><Author>Faizal A. Samman, Thomas Hollstein, and Manfred Glesner</Author><copyright>Copyright &amp;#x00A9; 2009 Faizal A. Samman et al. All rights reserved.</copyright></item><item><title>Particle Swarm Optimization for Constrained Instruction Scheduling</title><link>http://www.hindawi.com/journals/vlsi/2008/930610/</link><description>Instruction scheduling is an optimization phase aimed at balancing the performance-cost tradeoffs of the design of digital systems. In this paper, a formal framework is tailored in particular to find an optimal solution to the resource-constrained instruction scheduling problem in high-level synthesis. The scheduling problem is formulated as a discrete optimization problem and an efficient population-based search technique; particle swarm optimization (PSO) is incorporated for efficient pruning of the solution space. As PSO has proven to be successful in many applications in continuous optimization problems, the main contribution of this paper is to propose a new hybrid algorithm that combines PSO with the traditional list scheduling algorithm to solve the discrete problem of instruction scheduling. The performance of the proposed algorithms is evaluated on a set of HLS benchmarks, and the experimental results demonstrate that the proposed algorithm outperforms other scheduling metaheuristics and is a promising alternative for obtaining near optimal solutions to NP-complete scheduling problem instances.</description><Author>Rehab F. Abdel-Kader</Author><copyright>Copyright &amp;#x00A9; 2008 Rehab F. Abdel-Kader. All rights reserved.</copyright></item><item><title>Design and Characterization of the Next Generation Nanowire Amplifiers</title><link>http://www.hindawi.com/journals/vlsi/2008/190315/</link><description>Vertical nanowire surrounding gate field effect transistors (SGFETs) provide full gate control over the channel to eliminate short-channel effects. 
This paper presents the design and characterization of a differential pair amplifier using NMOS and PMOS SGFETs with a 10&amp;#x2009;nm channel length and a 2&amp;#x2009;nm channel radius. The amplifier dissipates 5&amp;#x2009;&amp;#x03BC;W of power and provides a 5&amp;#x2009;THz bandwidth with a voltage gain of 16, a linear output voltage swing of 0.5&amp;#x2009;V, and a distortion better than 3&amp;#37; from a 1.8&amp;#x2009;V power supply and a 20&amp;#x2009;aF capacitive load. The 2nd- and 3rd-order harmonic distortions of the amplifier are &amp;#x2212;40&amp;#x2009;dBm and &amp;#x2212;52&amp;#x2009;dBm, respectively, and the 3rd-order intermodulation is &amp;#x2212;24&amp;#x2009;dBm for a two-tone input signal with 10&amp;#x2009;mV amplitude and 10&amp;#x2009;GHz frequency spacing. All these parameters indicate that vertical nanowire surrounding gate transistors are promising candidates for next-generation high-speed analog and VLSI technologies.</description><Author>Sotoudeh Hamedi-Hagh and Ahmet Bindal</Author><copyright>Copyright &amp;#x00A9; 2008 Sotoudeh Hamedi-Hagh and Ahmet Bindal. All rights reserved.</copyright></item><item><title>An Energy-Efficient Multiwire Error Control Scheme for Reliable On-Chip Interconnects Using Hamming Product Codes</title><link>http://www.hindawi.com/journals/vlsi/2008/109490/</link><description>We propose an energy-efficient error control scheme for on-chip interconnects capable of correcting a combination of multiple random and burst errors. The iterative decoding method, interleaver, using two-dimensional Hamming product codes and a simplified type-II hybrid ARQ, achieves several orders of magnitude improvement in residual flit-error rate for multiwire errors and up to 45&amp;#37; improvement in throughput in high-noise environments. For a given system reliability requirement, the proposed error control scheme yields up to 50&amp;#37; energy improvement over other error correction schemes.
The low overhead of our approach makes it suitable for implementation in on-chip interconnect switches.</description><Author>Bo Fu and Paul Ampadu</Author><copyright>Copyright &amp;#x00A9; 2008 Bo Fu and Paul Ampadu. All rights reserved.</copyright></item><item><title>A Programmable Max-Log-MAP Turbo Decoder Implementation</title><link>http://www.hindawi.com/journals/vlsi/2008/319095/</link><description>With the advent of the very high data rates of the
upcoming 3G long-term evolution telecommunication systems,
there is a crucial need for efficient and flexible turbo decoder
implementations. In this study, a max-log-MAP turbo decoder is
implemented as an application-specific instruction-set processor.
The processor is accompanied by accelerating computing units,
which can be controlled in detail. With a novel memory interface,
the dual-port memory for extrinsic information is avoided. As a
result, processing one trellis stage with the max-log-MAP algorithm
takes only 1.02 clock cycles on average, which is comparable to
pure hardware decoders. With six turbo iterations and 277&amp;#x2009;MHz
clock frequency, 22.7&amp;#x2009;Mbps decoding speed is achieved on 130&amp;#x2009;nm technology.</description><Author>Perttu Salmela, Harri Sorokin, and Jarmo Takala</Author><copyright>Copyright &amp;#x00A9; 2008 Perttu Salmela et al. All rights reserved.</copyright></item><item><title>A Robust Low-Voltage On-Chip LDO Voltage Regulator in 180&amp;#8201;nm</title><link>http://www.hindawi.com/journals/vlsi/2008/259281/</link><description>This paper proposes a capacitor-less LDO with improved steady-state response and reduced transient overshoots and undershoots. The novelty in this approach is that the regulation is improved to a greater extent by the improved error amplifier in addition to improved transient response against five vital process corners. Also, the entire quiescent current required is kept below 100&amp;#8201;&amp;#x003BC;A. This LDO voltage regulator provides a constant 1.2&amp;#x02009;V output voltage against all load currents from zero to 50&amp;#x02009;mA with a maximum voltage drop of 200&amp;#x02009;mV. It is designed and tested using Spectre and targeted for fabrication on UMC 180&amp;#x02009;nm.</description><Author>Sreehari Rao Patri and K. S. R. Krishna Prasad</Author><copyright>Copyright &amp;#x00A9; 2008 Sreehari Rao Patri and K. S. R. Krishna Prasad. All rights reserved.</copyright></item><item><title>A Phase-Locked Loop with 30&amp;#37; Jitter Reduction Using Separate Regulators</title><link>http://www.hindawi.com/journals/vlsi/2008/512946/</link><description>A phase-locked loop (PLL) using separate regulators to reject the supply noise
is proposed in this paper. Two regulators, REG1 and REG2, are used to shield
the charge pump (CP) and the voltage-controlled oscillator (VCO), respectively,
from the supply noise. By using separate regulators, the area and the power consumption
of the regulator can be reduced. Moreover, the jitter of the proposed PLL
is proven on silicon to be less sensitive to the supply noise. The proposed PLL is
fabricated using a typical 0.35&amp;#x02009;&amp;#x3BC;m 2P4M 
CMOS process. The peak-to-peak jitter
(P2P jitter) of the proposed PLL is measured to be 81.8&amp;#x02009;ps at 80&amp;#x02009;MHz when 
a 250&amp;#x02009;mVrms supply noise is added. By contrast, the P2P jitter is measured to be 118.2&amp;#x02009;ps 
without the two regulators when the same supply noise is coupled.</description><Author>Tzung-Je Lee and Chua-Chin Wang</Author><copyright>Copyright &amp;#x00A9; 2008 Tzung-Je Lee and Chua-Chin Wang. All rights reserved.</copyright></item><item><title>Fine Control of Local Whitespace in Placement</title><link>http://www.hindawi.com/journals/vlsi/2008/517919/</link><description>In modern design methodologies, a large fraction of chip area during placement is left 
unused by standard cells and
allocated as &amp;#8220;whitespace.&amp;#8221; This is done for a variety of reasons
including the need for subsequent buffer insertion, as a means to
ensure routability, signal integrity, and low coupling capacitance
between wires, and to improve yield through DFM optimizations.
To this end, layout constraints often require a certain minimum
fraction of whitespace in each region of the chip. Our work
introduces several techniques for allocation of whitespace in
global, detail, and incremental placement. Our experiments show
how to efficiently improve wirelength by reallocating whitespace
in legal placements at large scale. Additionally, for the
first time in the literature, we empirically demonstrate high-precision
control of whitespace in designs with macros and
obstacles. Our techniques consistently improve the quality of
whitespace allocation of top-down as well as analytical placement
methods and achieve low penalties on designs from the ISPD 2006
placement contest with minimal interconnect increase.</description><Author>Jarrod A. Roy, David A. Papa, and Igor L. Markov</Author><copyright>Copyright &amp;#x00A9; 2008 Jarrod A. Roy et al. All rights reserved.</copyright></item></channel></rss>
