Abstract

We propose a footless domino logic buffer circuit that minimizes redundant switching at the dynamic and output nodes. The proposed circuit prevents the precharge pulse from propagating to the output node while allowing it at the dynamic node, which saves power. Simulations are carried out in a 0.18 µm CMOS technology. We calculate the power consumption, delay, and power delay product of the proposed circuit and compare the results with existing circuits for different logic functions, loading conditions, clock frequencies, temperatures, and supply voltages. The proposed circuit reduces power consumption and power delay product compared to the existing circuits.

1. Introduction

Domino logic circuits are used in a wide range of applications such as microprocessors [1] and memories [2]. Domino logic has an advantage over static logic because it requires less area and reduces the output load capacitance, which enhances speed. Realizing wide fan-in gates in static logic requires long stacks of pMOS and nMOS transistors, which is impractical and increases delay and area [3, 4]. Domino logic instead uses two phases, precharge and evaluation, to implement a complex circuit with a single evaluation network [5]. Its drawbacks are high power consumption due to clock loading and reduced noise margin due to charge sharing and charge leakage. Charge sharing is compensated by adding a keeper transistor.

A buffer is essential to drive the output of a domino circuit into the next stage [6, 7]. A static CMOS logic circuit consumes power when its output state toggles. A domino logic circuit also consumes power through unwanted redundant switching at the dynamic and output nodes, and this redundant switching makes it consume more power than a static CMOS circuit. Different circuits have been proposed in the literature to deal with this issue. Single-phase domino logic [8] and static switching pulse domino logic [9] reduce the redundant switching at both the dynamic and output nodes. True single-phase clock domino logic (TSPC) [10, 11], limited switch dynamic logic (LSDL) [12], and the pseudodynamic buffer (PDB) [13] reduce the redundant switching only at the output node.

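The power dissipation of a domino circuit is divided into three components [8]:

$$P_{\text{total}} = P_{\text{switching}} + P_{\text{leakage}} + P_{\text{short-circuit}},$$

where $P_{\text{switching}}$ is the power consumed during capacitance charging and discharging, $P_{\text{leakage}}$ is the total leakage power of the circuit, which increases as the technology is scaled down, and $P_{\text{short-circuit}}$ is the power dissipated when direct current flows from the power supply to ground.

The switching power is

$$P_{\text{switching}} = \alpha \, C_L \, V_{DD}^{2} \, f_{\text{CLK}},$$

where $\alpha$ is the switching activity at the output and dynamic nodes, which depends on the gate topology and inputs, $C_L$ is the capacitive load at the evaluation node, $V_{DD}$ is the supply voltage, and $f_{\text{CLK}}$ is the clock frequency.

The leakage power is

$$P_{\text{leakage}} = I_{\text{leakage}} \, V_{DD},$$

where $I_{\text{leakage}}$ is the combination of subthreshold and gate oxide leakage current.

The short-circuit power is

$$P_{\text{short-circuit}} = I_{\text{sc}} \, V_{DD},$$

where, for a domino logic gate, $I_{\text{sc}}$ is the contention current that flows between the evaluation network and the pMOS keeper during evaluation mode. This power dissipation must be kept low for better operation of the domino circuit.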
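To make the role of each component concrete, the following minimal sketch evaluates the standard CMOS relations: switching power as activity times capacitance times supply voltage squared times clock frequency, and leakage and short-circuit power as a current times the supply voltage. The function names are hypothetical, and the activity factor and the two currents are illustrative placeholders rather than values from this paper; only the 1.8 V supply, 100 fF load, and 200 MHz clock match the simulation setup of Section 4.

```python
# Sketch of the three power components of a domino gate.
# Function names are hypothetical; alpha, i_leak and i_sc are assumed
# placeholder values, not measurements from the paper.

def switching_power(alpha, c_load, vdd, f_clk):
    """Dynamic power in watts: activity * capacitance * Vdd^2 * frequency."""
    return alpha * c_load * vdd ** 2 * f_clk


def leakage_power(i_leak, vdd):
    """Leakage power in watts: leakage current * supply voltage."""
    return i_leak * vdd


def short_circuit_power(i_sc, vdd):
    """Short-circuit (contention) power in watts: current * supply voltage."""
    return i_sc * vdd


def total_power(alpha, c_load, vdd, f_clk, i_leak, i_sc):
    """Sum of the switching, leakage and short-circuit components."""
    return (switching_power(alpha, c_load, vdd, f_clk)
            + leakage_power(i_leak, vdd)
            + short_circuit_power(i_sc, vdd))


VDD = 1.8          # supply voltage (V)
C_LOAD = 100e-15   # load capacitance (F)
F_CLK = 200e6      # clock frequency (Hz)

# Node toggling every cycle (assumed alpha = 1) versus half as often.
p_full = switching_power(1.0, C_LOAD, VDD, F_CLK)
p_half = switching_power(0.5, C_LOAD, VDD, F_CLK)
print(f"alpha = 1.0: {p_full * 1e6:.1f} uW")
print(f"alpha = 0.5: {p_half * 1e6:.1f} uW")
```

Because switching power is linear in the activity factor, halving the number of transitions at a node halves its contribution, which is why removing redundant transitions is an effective lever.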

In this paper, we propose a switching-aware technique that minimizes redundant switching at the dynamic and output nodes, so that the circuit behaves like a static CMOS circuit. The pull-up transistor is controlled by a conditional pulse generator. The proposed circuit prevents the precharge pulse from propagating to the output node while allowing it at the dynamic node. The remainder of the paper is organized as follows. Previously proposed techniques are described in Section 2. The proposed structure is described in Section 3. Simulation results are presented in Section 4, and conclusions are presented in Section 5.

2. Previous Work

The standard footless domino logic circuit is shown in Figure 1. When the input is kept high, the circuit operates in two phases, as shown in Figure 2. During the precharge phase, M1 turns ON, the dynamic node is charged to high voltage, and the output is discharged to low voltage. During the evaluation phase, the dynamic node is discharged to low voltage and the output is charged to high voltage. When the input is kept low, the dynamic node maintains high voltage in both phases. Here, the precharge pulse is needed at the dynamic node but should be prevented from reaching the output node to keep the circuit stable. This redundant switching increases the power consumption.
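The cost of this behavior can be illustrated by counting node transitions in an idealized, cycle-level model of the standard footless domino buffer. The sketch below is a simplification under stated assumptions: each clock cycle is a precharge phase followed by an evaluation phase, the output is the logical inverse of the dynamic node, and all devices are ideal. The function name `count_toggles` is hypothetical.

```python
# Idealized cycle-level model of a standard footless domino buffer.
# Assumptions: ideal devices, output = NOT(dynamic node), one input
# value per clock cycle. Not a transistor-level simulation.

def count_toggles(inputs):
    """Return (dynamic_toggles, output_toggles) for a list of input bits."""
    dyn, out = 1, 0              # state at the end of a precharge phase
    dyn_toggles = 0
    out_toggles = 0
    for x in inputs:
        for phase in ("precharge", "evaluate"):
            if phase == "precharge":
                new_dyn = 1                  # M1 charges the dynamic node
            else:
                new_dyn = 0 if x else 1      # pull-down only if input is high
            new_out = 1 - new_dyn            # output inverter
            dyn_toggles += int(new_dyn != dyn)
            out_toggles += int(new_out != out)
            dyn, out = new_dyn, new_out
    return dyn_toggles, out_toggles


print(count_toggles([1, 1, 1, 1]))  # input held high
print(count_toggles([0, 0, 0, 0]))  # input held low
```

With the input held high for four cycles, the model reports seven transitions at each node even though the logical result never changes, whereas an input held low produces none. A static gate would not switch at all in either case.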

Circuit techniques such as single-phase domino logic and static switching pulse domino logic have been proposed in the literature. The main idea of these designs is to reduce redundant switching at both the dynamic and output nodes.

2.1. Single-Phase Domino Logic (SP-Domino)

Single-phase domino logic, shown in Figure 3 with its voltage characteristics in Figure 4, is similar to clock-delayed domino logic. The latest arriving input does not arrive before the rising edge of the delayed clock [8]. The gate has a single phase because both the pull-up and the pull-down of the dynamic node occur during the evaluation phase. Transistor M1 works as both pull-up and keeper. The pulse generator produces a signal that turns M1 ON unconditionally at the start of the evaluation cycle. If transistors M1 and M10 turn ON simultaneously, a small contention current flows between them for the short duration of the pulse at the gate of M1. If transistor M10 turns OFF at the start of the evaluation phase, M1 charges the dynamic node to high voltage. If the value of the dynamic node is low at the end of the pulse signal, M7 remains OFF and is pulled up high by transistor M4. The charging operation of the dynamic node starts after the pulse signal returns to low voltage. The pulse is a logical function of the clock-delayed signal and its delayed inverse.

The SP-domino design has several flaws. The size of M1 lacks flexibility. If the size of M1 increases, the keeper ratio increases. The keeper ratio is defined as the ratio of the current driving capability of transistor M1 to that of transistor M10. A high keeper ratio increases the contention current and the delay, and it makes the rise and fall times of the output signal asymmetrical. To obtain symmetric rise and fall times, the size of M1 must be fixed at a particular value.

2.2. Static Switching Pulse Domino (SSPD)

SSPD is similar to SP-domino, with static input and output characteristics [9]. SP-domino uses a single transistor as pull-up and keeper, whereas SSPD employs separate transistors M1 and M2, which never turn ON simultaneously. SP-domino lacks flexibility in sizing transistor M1 to obtain symmetrical rise and fall delays at the output, whereas SSPD allows independent tuning of the rise and fall delays.

The SSPD technique employs a conditional pulse generator (CPG), shown in Figure 5 with its voltage characteristics in Figure 6. The CPG generates a pulse that turns M1 ON only when the dynamic node has been discharged or held low in the last evaluation cycle, in which case keeper M2 is OFF. If the dynamic node has not been discharged, the CPG keeps M1 OFF. M8 is ON, and the keeper provides the contention current. The CPG internally generates two additional clock phases, CCLKd and CCLKi, whose behavior depends on the clock signal (CLK) and the dynamic node. These two clock phases are used by block G1 in the CPG to produce the pulse signal. The drawback of this technique is that it requires a complex conditional pulse generator. The pulse is a logical function of CCLKd and CCLKi, the conditionally generated delayed and inverse phases of the original clock CLK.

2.3. Buffer Circuits to Reduce Redundant Switching at Output Node

Limited switch dynamic logic (LSDL) is similar to the standard domino logic circuit except that a latch structure is added at the dynamic node [12]. This latch structure increases the parasitic capacitance at the dynamic node. It eliminates the redundant switching at the output node but not at the dynamic node. LSDL provides dual outputs without the need for dual-rail signaling. LSDL has two drawbacks: first, it requires a latch circuit at every dynamic node, which increases power consumption and area; second, it needs three clock transistors, which increases the load capacitance of the clock signal.

TSPC dynamic logic is similar to standard footed domino logic except that an extra nMOS transistor is connected in the output inverter [10, 11]. This circuit requires three clock transistors, which increases the load capacitance of the clock signal and the power consumption. In the pseudodynamic buffer, the source of the output inverter is connected to the source of the pull-down network [13]. This helps prevent the propagation of the precharge pulse to the output node when the input is high.

3. Proposed Structure

The circuit diagram of the proposed circuit is shown in Figure 7, and its voltage characteristics at different nodes are shown in Figure 8. The proposed circuit has static input and output characteristics. It uses a delayed clock, similar to SP-domino. The pMOS transistor M1 functions as the pull-up and M2 as the keeper. The buffer includes a pulse generator block that generates the pulse signal controlling the pull-up M1. The pulse generator consists of an inverting delay chain of inverters and a three-input NAND gate. The inputs of the NAND gate are the clock, its delayed inverse, and the output of the circuit. The pulse generator is conditional because the NAND gate produces low voltage only when all of its inputs are high. The timing diagram of the clocks and the output is shown in Figure 9. This technique prevents the precharge pulse from propagating to the output node while allowing it at the dynamic node.
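The conditional behavior of the pulse generator can be sketched at the logic level. The example below assumes ideal gates, a sampled-time clock, a delayed inverse clock obtained by inverting the clock and shifting it by a fixed number of samples, and a clock that was low before the first sample. The names `pulse` and `pulse_waveform` and the sample waveforms are hypothetical and chosen only for illustration.

```python
# Logic-level sketch of the conditional pulse generator.
# Assumptions: ideal gates, sampled time, and a clock that was low
# before the first sample. Names and waveforms are illustrative only.

def pulse(clk, clk_delayed_inv, out):
    """Three-input NAND: low only when all three inputs are high."""
    return 0 if (clk and clk_delayed_inv and out) else 1


def pulse_waveform(clk_wave, out_wave, delay):
    """Evaluate the NAND over sampled waveforms.

    The delayed inverse clock is the clock inverted and shifted by
    `delay` samples, modeling the inverter chain.
    """
    result = []
    for t, (clk, out) in enumerate(zip(clk_wave, out_wave)):
        past_clk = clk_wave[t - delay] if t >= delay else 0
        clk_delayed_inv = 1 - past_clk
        result.append(pulse(clk, clk_delayed_inv, out))
    return result


clk = [0, 0, 1, 1, 1, 1, 0, 0]
print(pulse_waveform(clk, [1] * 8, delay=2))  # output high: pulse appears
print(pulse_waveform(clk, [0] * 8, delay=2))  # output low: no pulse
```

In this model the active-low pulse lasts for `delay` samples after the rising clock edge when the output is high, and it is suppressed entirely when the output is low, so M1 is not driven when no precharge is needed.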

The operation of the circuit is explained by considering the input logic. When CLK is low, the output of the NAND gate is high, M1 is OFF, and the dynamic node holds its previous value regardless of the input of the circuit. When the clock is high, the operation of the circuit is explained in two cases.

Case 1. When the input is high, M5 turns ON. When CLK goes high, the delayed inverse clock and the output are also high for a short period, so the NAND gate output is low and M1 and M5 are ON simultaneously, producing a contention current as shown in Figure 10. M5 must be large enough to discharge the dynamic node to low voltage; if M5 is too small, the contention current supplied by pull-up transistor M1 will change the logic of the circuit. After this delay, the NAND gate output goes high and M1 turns OFF as shown in Figure 11. No further contention current then flows into the pull-down network, and the dynamic node remains at logic low. In the figures, a cross denotes an OFF transistor.

Case 2. When the input is low and CLK goes high, the NAND gate output is low for a short period, so M1 turns ON and precharges the dynamic node to high voltage as shown in Figure 12. After this delay, the NAND gate output goes high, M1 turns OFF, and the dynamic node remains at high voltage as shown in Figure 13. The logical expression for the pulse is $\mathrm{PULSE} = \overline{\mathrm{CLK} \cdot \mathrm{CLK}_{db} \cdot \mathrm{OUT}}$, where $\mathrm{CLK}$ is the clock signal, $\mathrm{CLK}_{db}$ is the delayed inverse of the original clock, and OUT is the output signal.
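The two cases can be combined into a coarse cycle-level model that counts transitions at the dynamic and output nodes. This is only a behavioral abstraction under several assumptions: the output is the inverse of the dynamic node, the pull-down always overpowers M1 when the input is high, the pulse can fire only if the output was high before the clock edge, and the dynamic node otherwise holds its value. The function name `simulate_proposed` is hypothetical.

```python
# Coarse behavioral model of the proposed buffer, one input bit per
# clock cycle. Assumptions: output = NOT(dynamic node), the pull-down
# always wins against M1, ideal hold when nothing drives the node.

def simulate_proposed(inputs, dyn=1):
    """Return (dynamic_toggles, output_toggles) for a list of input bits."""
    out = 1 - dyn
    dyn_toggles = 0
    out_toggles = 0
    for x in inputs:
        # At the rising clock edge the NAND pulse goes low only if the
        # output is currently high.
        pulse_low = (out == 1)
        if x:
            new_dyn = 0          # Case 1: pull-down discharges the node
        elif pulse_low:
            new_dyn = 1          # Case 2: M1 precharges the node
        else:
            new_dyn = dyn        # no pulse, no pull-down: value is held
        new_out = 1 - new_dyn
        dyn_toggles += int(new_dyn != dyn)
        out_toggles += int(new_out != out)
        dyn, out = new_dyn, new_out
    return dyn_toggles, out_toggles


print(simulate_proposed([1, 1, 1, 1]))  # input held high
print(simulate_proposed([0, 0, 0, 0]))  # input held low
print(simulate_proposed([1, 1, 0, 0]))  # one input change
```

In this abstraction the nodes switch only when the input changes value, which is the static-like behavior the circuit is designed to achieve. Contention current and pulse timing are deliberately left out of the model.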

4. Simulation Results

The proposed and existing circuits are simulated using HSPICE in the high-performance 180 nm predictive technology [14]. The supply voltage in the simulations is 1.8 V and the clock rate is 200 MHz with a 50% duty cycle (clock period of 5 ns). The rise and fall times of the clock are set to 10 ps. Transistor size is set to $27 L_{\min}$ for all circuits. The worst-case delay is determined from the input to the output node. Power consumption is determined when the input is at high voltage. Standby power is measured when the input of the circuit is low.

The power saving of the proposed domino circuit relative to the standard footless domino circuit for various logic functions is tabulated in Table 1. In this comparison, the clock frequency, input frequency, and load capacitance were set to 200 MHz, 50 MHz, and 100 fF, respectively. The table shows that OR gate logic saves more power than AND gate logic.

Figure 14 compares the power consumption of the proposed circuit with that of the existing circuits (standard footless domino, SP-domino, and SSPD) at a clock frequency of 200 MHz as the load capacitance is varied. The power saving of the proposed circuit relative to the existing circuits grows with load capacitance. At a load capacitance of 500 fF, the proposed circuit reduces power consumption by 69.6%, 18.03%, and 15.44% compared to the standard footless domino, SP-domino, and SSPD techniques, respectively. The proposed circuit has a lower delay than the SP-domino and SSPD techniques but not than the standard footless domino circuit, as shown in Figure 15. The proposed circuit has a better power delay product, and its saving relative to the existing circuits is larger at higher load capacitance, as shown in Figure 16. The proposed circuit has better standby power than the SP-domino technique but a higher value than the other existing techniques, as shown in Figure 17.
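For reference, the two figures of merit used in these comparisons can be computed as in the sketch below: the power delay product is power multiplied by delay, and the percentage saving is the reduction relative to the reference circuit. The function names are hypothetical, and the absolute power and delay values are placeholders chosen only so that the saving works out to the 69.6% quoted above; they are not measurements from the paper.

```python
# Figures of merit used in the comparison. The numeric inputs below are
# illustrative placeholders, not data from the paper.

def power_delay_product(power_w, delay_s):
    """Power delay product in joules."""
    return power_w * delay_s


def percent_saving(reference, proposed):
    """Reduction of `proposed` relative to `reference`, in percent."""
    return 100.0 * (reference - proposed) / reference


p_reference = 100e-6   # placeholder reference power (W)
p_proposed = 30.4e-6   # placeholder proposed-circuit power (W)
delay = 100e-12        # placeholder delay (s)

saving = percent_saving(p_reference, p_proposed)
pdp = power_delay_product(p_proposed, delay)
print(f"power saving: {saving:.1f} %")
print(f"PDP: {pdp * 1e15:.2f} fJ")
```

The same `percent_saving` function applies to the power delay product, so a circuit with a small delay penalty can still show a PDP saving when its power reduction is large enough.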

Figure 18 compares the power consumption of the proposed circuit and the existing circuits for different clock frequencies, with the load capacitance set to 100 fF. Power consumption increases with clock frequency, and the maximum power saving is achieved at higher operating frequencies. The proposed circuit reduces power consumption by 66.86%, 38.27%, and 23.51% compared to the standard footless circuit, SP-domino circuit, and SSPD circuit, respectively.

Figure 19 illustrates the relationship between power consumption and temperature for the proposed circuit and the existing circuits, with the clock frequency and load capacitance set to 200 MHz and 100 fF. It shows that the proposed circuit is independent of temperature variation. At higher temperatures, the proposed circuit has lower power consumption than the existing circuits. At 110°C, the proposed circuit reduces power consumption by 78.55%, 38.7%, and 26.37% compared to the standard footless circuit, SP-domino circuit, and SSPD circuit, respectively. Similarly, Figure 20 shows delay versus temperature. The proposed circuit suffers a small delay penalty compared to the standard footless circuit and improves on the other existing techniques. Figure 21 shows PDP versus temperature; the proposed circuit has the lowest PDP among the compared techniques.

Figure 22 illustrates the power consumption of the proposed circuit and the existing circuits for different supply voltages. The clock frequency is set to 200 MHz and the load capacitance to 100 fF. The proposed circuit shows a larger power saving at higher supply voltages.

The layouts of the standard domino and the proposed circuit are implemented in the Tanner L-EDIT tool using a 0.18 µm standard CMOS technology, as shown in Figure 23. The results of the pre- and postlayout simulations are summarized in Table 2. The clock frequency and load capacitance were set to 200 MHz and 100 fF, respectively. The postlayout simulation accounts for parasitic capacitance, so the total capacitance at the output node increases. Table 2 shows that the postlayout simulations have higher delay, power consumption, and PDP than the prelayout simulations. Further, the proposed circuit requires 2.87 times the area of the standard domino circuit.

5. Conclusions

In this paper, a new static buffer circuit is proposed. The proposed circuit minimizes the redundant switching at both the dynamic and output nodes. It includes a conditional pulse generator that controls the pull-up transistor of the circuit. This technique prevents the precharge pulse from propagating to the output node while allowing it at the dynamic node. The proposed circuit and the existing circuits (standard footless domino, single-phase domino, and static switching pulse domino) are simulated in 0.18 µm technology using HSPICE. The performance of the proposed structure is compared with the existing circuits for different clock frequencies, loading conditions, and temperatures. The proposed circuit consumes less power than the existing circuits. At a load capacitance of 500 fF, the proposed circuit reduces power consumption by 69.6%, 18.03%, and 15.44% compared to the standard footless domino, SP-domino, and SSPD techniques, respectively. Layouts of the proposed and standard domino circuits are implemented using standard CMOS technology. The postlayout simulations show higher delay, power consumption, and power delay product than the prelayout simulations.

Acknowledgment

The authors duly acknowledge with gratitude the support from the Department of Information Technology, the Ministry of Communications and Information Technology, Government of India, New Delhi, India, through Special Manpower Development Program in VLSI and related software Phase-II (SMDP-II) Project in E&CE department, MNNIT Allahabad, India.