#### Abstract

In recent years, subthreshold operation has attracted considerable attention because of its ultralow power consumption in applications requiring low to medium performance. It has also been shown that optimizing the device structure can further reduce the power consumption of digital subthreshold logic while improving its performance. Subthreshold circuit design is therefore very promising for future ultralow-energy sensor applications as well as high-performance parallel processing. This paper discusses the device and circuit design challenges associated with the state of the art in digital subthreshold circuit design and reviews device design methodologies and circuit topologies for optimal digital subthreshold operation. It identifies suitable candidates for subthreshold operation at the device and circuit levels and provides a roadmap for digital designers interested in ultralow-power applications.

#### 1. Introduction

In the digital VLSI system design space, considerable attention has been given to the design of high-performance microprocessors. In recent years, however, the demand for power-sensitive designs has grown significantly, driven mainly by the fast growth of battery-operated portable applications such as personal digital assistants, cellular phones, medical applications, wireless receivers, and other portable communication devices. Further, with the aggressive scaling of transistor sizes for high-performance applications, not only does the subthreshold leakage current increase exponentially, but the gate leakage and the band-to-band tunneling (BTBT) currents of the reverse-biased source-substrate and drain-substrate junctions also increase significantly. The tunneling currents are detrimental to the functionality of the devices. Well-known low-power design methods (such as voltage scaling, switching activity reduction, architectural techniques of pipelining and parallelism, and Computer-Aided Design (CAD) techniques of device sizing, interconnect, and logic optimization) may not be sufficient in many applications, such as portable computing gadgets and medical electronics, where ultralow power consumption at a medium frequency of operation (tens to hundreds of MHz) is the primary requirement. To cope with this, several novel design techniques have been proposed. Energy recovery or adiabatic techniques promise to reduce power in computation by orders of magnitude, but they require high-quality inductors, which makes integration difficult. More recently, digital subthreshold logic has been investigated, in which transistors are operated in the subthreshold region (supply voltage ($V_{DD}$) less than the threshold voltage ($V_{TH}$) of the transistor) [1–4]. In this technique, the subthreshold leakage current of the device is used for the necessary computation. This results in high transconductance gain of the devices (thereby providing near-ideal voltage transfer characteristics of the logic gates) and reduced gate input capacitance. Its impact on system design is an exponential reduction of power at the cost of reduced performance. Digital computation using subthreshold leakage current has gained wide interest in recent years as a way to achieve ultralow power consumption in portable computing devices. Both logic and memory circuits have been extensively studied, with design considerations at various levels of abstraction. It has been shown that subthreshold operation can achieve significant power savings in applications requiring low to medium (tens to hundreds of megahertz) frequency of operation [5–7].
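The exponential dependence of the subthreshold current on gate voltage can be illustrated with the textbook weak-inversion model. The sketch below is not taken from the cited works; the reference current `i0`, the slope factor `n`, and the threshold voltage `v_th` are assumed, illustrative values.

```python
import math

K_B_OVER_Q = 8.617333e-5  # Boltzmann constant over electron charge, in V/K


def thermal_voltage(temp_k=300.0):
    """Thermal voltage kT/q in volts."""
    return K_B_OVER_Q * temp_k


def subthreshold_current(v_gs, v_ds, v_th=0.4, i0=1e-7, n=1.5, temp_k=300.0):
    """Weak-inversion drain current in amperes (textbook model).

    i0 is the assumed current at v_gs == v_th for large v_ds, and n is the
    assumed subthreshold slope factor. Both are illustrative, not fitted.
    """
    v_t = thermal_voltage(temp_k)
    gate_term = math.exp((v_gs - v_th) / (n * v_t))
    drain_term = 1.0 - math.exp(-v_ds / v_t)
    return i0 * gate_term * drain_term


def swing_mv_per_decade(n=1.5, temp_k=300.0):
    """Gate voltage change (mV) that changes the current by a factor of ten."""
    return n * thermal_voltage(temp_k) * math.log(10.0) * 1e3


if __name__ == "__main__":
    v_dd = 0.2
    i_on = subthreshold_current(v_dd, v_dd)
    i_off = subthreshold_current(0.0, v_dd)
    print(f"swing = {swing_mv_per_decade():.1f} mV/dec")
    print(f"I_ON / I_OFF at V_DD = {v_dd} V: {i_on / i_off:.0f}")
```

In this model every increase of one swing in gate voltage multiplies the current by ten, which is why a supply of only a few hundred millivolts can still separate the ON and OFF currents by orders of magnitude.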

This paper is organized as follows. The scope of subthreshold operation for ultralow-power applications is presented in Section 2. The challenges confronting current and future robust subthreshold circuit design are reviewed in Section 3. Section 4 presents the device-level optimization methodologies identified for optimal subthreshold operation. Section 5 covers circuit styles other than static CMOS that are suitable for robust subthreshold operation. Finally, conclusions are drawn in Section 6.

#### 2. Scope of Subthreshold Operation for Ultralow Power Applications

Subthreshold circuits operate with a supply voltage that is less than the threshold voltage of the transistor, far below traditional levels, and consequently the transistor operates essentially on leakage. While traditional digital CMOS has relied on running transistors either in the ON state (saturation) or in the OFF state (subthreshold), subthreshold circuits are either in an OFF state or in an almost-ON state (still in the subthreshold regime but with weak inversion). Running at these nonstandard operating points limits performance, which remains acceptable for low-to-medium cost applications given the substantial increase in energy efficiency. Since dynamic power is related quadratically to the supply voltage, reducing the voltage to these ultralow levels results in a dramatic reduction in both power and energy consumption in digital systems. Because of the exponential current-voltage (*I*-V) characteristics of the transistor, subthreshold logic gates provide near-ideal voltage transfer characteristics. Furthermore, in the subthreshold region, the transistor input capacitance is smaller than in strong inversion operation. The transistor input capacitance ($C_i$) in subthreshold is a combination of the intrinsic (oxide capacitance ($C_{ox}$) and depletion capacitance ($C_d$)) and parasitic (overlap capacitance ($C_{do}$), fringing capacitances ($C_{if}$, $C_{of}$), etc.) capacitances of a transistor (Figure 1) and is given by [8]

$$C_i = \text{series}(C_{ox}, C_d) \parallel C_{if} \parallel C_{of} \parallel C_{do}.$$

In contrast, the input capacitance in strong inversion operation is dominated by the oxide capacitance. Because of the smaller capacitance and the lower supply voltage (below the threshold voltage of the transistor), digital subthreshold circuits consume less power than their strong inversion counterparts at a given frequency of operation. However, since the subthreshold leakage current serves as the operating current, these circuits cannot be operated at very high frequencies. Figure 2 illustrates the region of operation for digital subthreshold circuits.
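The capacitance and power argument above can be put into numbers with a small sketch. It assumes that the intrinsic subthreshold capacitance is the series combination of oxide and depletion capacitance with the parasitics added in parallel, that the strong-inversion input capacitance is simply the oxide capacitance plus the same parasitics, and that dynamic power follows the usual `activity * C * V^2 * f` expression. The function names and all numeric values are illustrative assumptions.

```python
def series(c1, c2):
    """Series combination of two capacitances."""
    return c1 * c2 / (c1 + c2)


def subthreshold_input_capacitance(c_ox, c_d, c_do, c_if, c_of):
    """Intrinsic series(C_ox, C_d) in parallel with overlap and fringe parasitics."""
    return series(c_ox, c_d) + c_do + c_if + c_of


def strong_inversion_input_capacitance(c_ox, c_do, c_if, c_of):
    """Assumed strong-inversion model: oxide capacitance plus the same parasitics."""
    return c_ox + c_do + c_if + c_of


def dynamic_power(c_load, v_dd, freq, activity=1.0):
    """Dynamic switching power: activity * C * V_DD^2 * f."""
    return activity * c_load * v_dd ** 2 * freq


if __name__ == "__main__":
    # Normalized, made-up capacitances (units of C_ox).
    c_sub = subthreshold_input_capacitance(1.0, 0.25, 0.1, 0.05, 0.05)
    c_str = strong_inversion_input_capacitance(1.0, 0.1, 0.05, 0.05)
    print(f"C_sub / C_strong = {c_sub / c_str:.2f}")
    # Scaling V_DD from 1.2 V to 0.3 V at the same frequency and load.
    p_ratio = dynamic_power(1e-15, 0.3, 1e6) / dynamic_power(1e-15, 1.2, 1e6)
    print(f"P(0.3 V) / P(1.2 V) = {p_ratio:.4f}")
```

The quadratic term alone gives a 16-fold reduction when the supply is lowered by a factor of four; the smaller input capacitance reduces the power further.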

The potential for minimizing energy at the cost of speed degradation defines the following set of applications for which subthreshold circuits are well suited.

(a) Energy-constrained applications, such as wireless sensor nodes, RFID tags, medical equipment such as hearing aids and pacemakers, wearable computing devices or implants, personal digital assistants, energy-scavenging applications, and laptops, are dominated primarily by the need to minimize energy consumption and increase battery lifetime. Speed is a secondary consideration for this class of applications, so subthreshold circuits offer a good solution.

(b) Many burst-mode applications require high performance for very short durations between extended periods of low-performance operation. Subthreshold circuits can minimize the energy of computations executed during the low-performance slots. Applications of this type appear in almost every design, including high-performance microprocessors and cell phones.

#### 3. Roadmap or State-of-the-Art Challenging Issues in Digital Subthreshold Circuit Design

We have identified various device and circuit design challenges that need to be addressed to advance the state of the art in subthreshold circuit design, emphasizing the need for codesign at all levels of abstraction (device, circuit, architecture, and so forth). This section provides insight into these challenges for designers interested in energy-constrained applications, particularly those that take advantage of subthreshold circuits.

*(1) Device Optimization for Subthreshold*

Subthreshold circuits can greatly benefit from redesigning the devices. In addition to technology scaling, which improves performance in subthreshold operation, devices need to be optimized specifically for subthreshold operation to reach higher operating frequencies, since conventional devices, which are optimized for operation in the strong inversion region, may not give optimal results in subthreshold operation [9–17].

*(2) Exploring Logic Families Optimal for Subthreshold Circuit Design*

The low $V_{DD}$ results in a reduced $I_{ON}/I_{OFF}$ ratio, which can reduce robustness. Static CMOS gates continue to function in subthreshold, but because short-channel effects are aggravated by variations at the nanoscale, logic families other than static CMOS may offer greater resiliency to certain variation sources such as voltage or process. The design of robust subthreshold logic circuits using logic families other than static CMOS is therefore another open area for exploration [18–24].

*(3) PVT Insensitive Design Methodologies for Subthreshold*

Variability from all sources, including Process, Voltage, and Temperature (PVT), is magnified in subthreshold circuits because of the exponential *I*-V characteristics. There is therefore a great need for a range of effective techniques to combat this variability and to design robust and reliable subthreshold circuits [25–30].
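A rough sense of this magnification can be had by comparing how the same threshold-voltage shift changes the drive current in the two regimes. The sketch below uses the textbook exponential weak-inversion dependence and, for comparison, an alpha-power-law model for strong inversion; the slope factor `n`, the exponent `alpha`, and the voltages are assumed values chosen only for illustration.

```python
import math

V_T = 0.02585  # thermal voltage kT/q near 300 K, in volts


def subthreshold_current_factor(delta_vth, n=1.5):
    """Relative current after a threshold shift delta_vth (V) in weak inversion."""
    return math.exp(-delta_vth / (n * V_T))


def strong_inversion_current_factor(delta_vth, v_dd=1.0, v_th=0.3, alpha=1.3):
    """Relative current after the same shift under an assumed alpha-power law."""
    return ((v_dd - v_th - delta_vth) / (v_dd - v_th)) ** alpha


if __name__ == "__main__":
    shift = 0.03  # a hypothetical 30 mV threshold-voltage increase
    print(f"subthreshold:     x{subthreshold_current_factor(shift):.2f}")
    print(f"strong inversion: x{strong_inversion_current_factor(shift):.2f}")
```

Under these assumptions a 30 mV shift cuts the subthreshold current by more than half, while the strong-inversion current changes by only a few percent.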

*(4) Device Modeling and Sizing Analysis for Subthreshold*

For $V_{DD} < V_{TH}$, delay increases exponentially with additional voltage scaling. Leakage current integrates over the longer delay until the leakage energy per operation exceeds the active energy. There is a great need for models that capture this effect and illustrate the impact of variations on the minimum energy point, the optimal supply voltage, and the threshold voltage of subthreshold circuits [31–34].
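The interplay between active and leakage energy described above can be sketched with a simple first-order model. This is an illustrative assumption, not the model of the cited works: gate delay is taken as proportional to `C * V_DD / I_ON`, with `I_ON = I_OFF * exp(V_DD / (n * V_T))`, and the activity factor, logic depth, and slope factor are made-up values.

```python
import math

V_T = 0.02585  # thermal voltage near 300 K, in volts


def energy_per_operation(v_dd, activity=0.1, logic_depth=20, n=1.5, c_eff=1.0):
    """Return (active, leakage) energy per cycle for one gate, in units of c_eff.

    With delay = c_eff * v_dd / I_on and I_on = I_off * exp(v_dd / (n * V_T)),
    the leakage energy I_off * v_dd * (logic_depth * delay) reduces to the
    closed form used below. All parameter values are illustrative assumptions.
    """
    active = activity * c_eff * v_dd ** 2
    leakage = logic_depth * c_eff * v_dd ** 2 * math.exp(-v_dd / (n * V_T))
    return active, leakage


def minimum_energy_point(v_min_mv=100, v_max_mv=400):
    """Sweep V_DD in 1 mV steps and return (v_opt, e_min)."""
    best_v, best_e = None, float("inf")
    for mv in range(v_min_mv, v_max_mv + 1):
        v = mv / 1000.0
        total = sum(energy_per_operation(v))
        if total < best_e:
            best_v, best_e = v, total
    return best_v, best_e


if __name__ == "__main__":
    v_opt, e_min = minimum_energy_point()
    print(f"minimum-energy V_DD ~ {v_opt:.3f} V (normalized energy {e_min:.4f})")
```

In this model the threshold voltage cancels out of the leakage term, so the optimum depends only on the activity factor, the logic depth, and the slope factor; a real design would need fitted device parameters and a variation-aware analysis.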

*(5) Need for Alternative Scaling Trends for Subthreshold*

The scaling of transistor dimensions and electrical characteristics represents both an opportunity and a threat for subthreshold circuits. Device scaling reduces gate capacitance, and at super-threshold voltages it offers a welcome reduction in switching energy and gate delay. Scaling has also led to a dramatic increase in density (which was an effective cost-reduction measure in the past). At the same time, device scaling has brought about a number of problems in super-threshold circuits, including process variability, increased subthreshold leakage, and increased gate leakage. The implications of device scaling for super-threshold circuits have been explored extensively; however, no such focus has been given to subthreshold circuits. Transistor design is particularly important in the subthreshold regime because of the exponential sensitivities to $V_{TH}$, $V_{DD}$, and the inverse subthreshold slope; it is therefore not immediately clear how subthreshold circuits will fare under device scaling. Very few studies have comprehensively examined the effects of device scaling on subthreshold circuits. It is therefore important both to understand clearly the consequences of traditional performance-driven scaling for subthreshold combinational blocks and SRAM cells and to develop improved scaling strategies that target the needs of subthreshold circuits [35–38].

*(6) Development of Subthreshold Compatible and Robust Memory Design*

Energy-efficient subthreshold design cannot succeed without robust and dense ultralow-voltage memory design techniques. SRAM is an important component of many ICs, and it can contribute a large fraction of the active and leakage power consumption.

The major concerns with subthreshold memory design are the following.

(a) Process variation in very small devices worsens the mismatch behavior of the traditional 6T SRAM cell. Random variation fundamentally affects the geometry and threshold voltage of CMOS devices and is increasingly prominent in scaled technologies. The problem is exacerbated in subthreshold, where device strength depends exponentially on threshold voltage and, in the presence of variation, relative strengths cannot be guaranteed by sizing. As a result, the widely used 6T SRAM cell, which relies on ratioed operation and is used to maintain density, fails to operate in subthreshold. It is therefore important to have subthreshold-compatible SRAMs for subthreshold systems [39–43].

(b) Reduced ON-to-OFF current ratios complicate the read and write operations. None of the current approaches is completely satisfactory, and advances in this area are among the most crucial needs for the proliferation of extremely low-power systems.

*(7) Need for Codesign Approach for Subthreshold*

In the new paradigm of computation with leakage, conventional wisdom can deliver low-power systems but fails to provide the optimal or near-optimal solution. For subthreshold operation, the lowest power for a given throughput can be achieved only by a complete codesign across device, circuit, and architecture [16, 17]. A lot of work needs to be done in that direction. In addition, a complete codesign at all levels of the hierarchy (device, circuit, and architecture) can further suppress process variation effects, reduce power consumption, and improve performance. Variability-aware design strategies at all levels of abstraction (device, circuit, and architecture) are therefore imperative to ensure the success and functionality of power-efficient designs [44].

*(8) Developing Subthreshold Benchmark Circuits*

Since there are no industrial subthreshold devices against which results for optimized subthreshold devices can be compared, benchmark circuits need to be built with the optimized subthreshold devices so that variation immunity, power, and performance can be compared against subthreshold circuits built from standard devices.

*(9) Advancement in CAD Tools*

Another significant issue for subthreshold operation is system verification. Using SPICE to verify large systems rapidly becomes infeasible as the number of process corners, temperature corners, and supply voltage values increases. HSPICE is too slow for larger circuits, and NanoSim can simulate large netlists in reasonable time but will not correctly model the devices for supply voltages below 1 V. There is therefore a need either to modify current simulators or to develop a new subthreshold circuit simulator that can verify large systems running at such ultralow voltages and estimate the power dissipation of circuits. Advances in CAD tools to address this problem are necessary. These tools must also account for the statistical distributions of delay and power introduced by local variations [45, 46].

*(10) Ultradynamic Voltage Scaling (UDVS)*

Since an entire system may not be able to operate completely in the subthreshold region, devices need to be switched periodically between strong inversion and subthreshold operation. UDVS is therefore a strong candidate for tying together subthreshold operation and higher-performance operation. Further work on UDVS can focus on system integration. Decisions about the best interfaces among blocks operating at different effective rates and $V_{DD}$ values will affect the system energy and delay. Selecting the best bus protocols, level converters, and dc/dc converters for a system remains an open problem. Theoretical work on UDVS can also investigate optimum scheduling and control at the system level. Such system-level analysis can consider all of the blocks and their modes of operation, from full shutdown to full-speed active mode [45, 46].

*(11) Architectures for Optimal Subthreshold Circuits*

There are many opportunities for future work on architectures for subthreshold circuits. One area is the use of pipelining and massively parallel architectures that increase the activity factor of a circuit and require minimum supply voltage operation. There is also a great need for a complete subthreshold standard cell library, which would provide further opportunities to optimize for minimal energy dissipation [45, 46].

#### 4. Device-Level Optimization Methodologies for Subthreshold Operation

In conventional approaches, standard transistors are operated in the subthreshold region to implement subthreshold logic. Standard transistors are “super-threshold transistors” that are optimized for ultrahigh-performance design. It is therefore prudent to investigate whether standard transistors are well suited for subthreshold operation. The following device optimization methodologies have been identified; they offer insight for developing new methodologies for present and future technology nodes.

##### 4.1. Bulk CMOS Technology for Subthreshold Operation

We have identified various device optimization methodologies for bulk CMOS technology in the subthreshold region, and we hope this section gives readers a concise overview that helps identify the gaps in the technology [9–12].

###### 4.1.1. Device Optimization by Changing Channel Doping Profile for Subthreshold Operation

It is an established fact that scaled super-threshold transistors require halo and retrograde doping to suppress short-channel effects. The main functions of halo doping and retrograde wells are to reduce drain-induced barrier lowering (DIBL), prevent body punch-through, and control the threshold voltage of the device independently of its subthreshold slope. In subthreshold operation, however, the overall supply bias is small (on the order of 0.15 V–0.3 V). Consequently, the effects of DIBL and body punch-through are extremely low. Further, as long as a given $I_{OFF}$ budget is met, a better subthreshold slope (*S*) leads to a better device. Since the region of interest lies below the threshold voltage, the actual value of the threshold voltage is not of interest as long as a predefined $I_{OFF}$ and *S* are met. Hence, it has been shown both qualitatively and quantitatively that halo and retrograde doping are not essential for subthreshold device design [9].

The absence of halo and retrograde doping has the following implications.

(i) A simplified process technology in terms of process steps and cost.

(ii) A significant reduction of the junction capacitances. The halo regions near the source-substrate and drain-substrate junctions significantly increase the junction capacitances, thereby increasing the switching power and the delay of the logic gates. Omitting the halo/retrograde doping reduces this junction capacitance.

It should, however, be noted that the doping profile in these optimized devices should be high-to-low [9]. A low doping level in the bulk of the device is necessary to (i) reduce the capacitance of the bottom junction and (ii) reduce substrate noise effects and parasitic latch-up.

Table 1 shows that, owing to the above-mentioned factors, the optimized subthreshold device improves on the standard device in subthreshold slope by 7.8%, junction capacitance by 34.7%, ON current by 60%, and power-delay product (PDP) by 50%.

###### 4.1.2. Oxide Thickness Optimization for Subthreshold Operation

It has been shown in [9] that halo and retrograde doping profiles are not necessary in devices for subthreshold operation (because of the low supply voltage) and that a high-low doping profile is instead suitable for achieving a better subthreshold slope and lower junction capacitance. That analysis, however, assumes the minimum oxide thickness ($T_{ox}$) offered by the technology in order to obtain a better subthreshold slope. The minimum possible oxide thickness may not be optimum for subthreshold operation, because it does not guarantee minimum energy consumption, which is the primary goal of subthreshold operation [12].

Although the intrinsic gate capacitance of the transistor in subthreshold operation is dominated by the depletion capacitance, the parasitic capacitances, such as the overlap and fringe capacitances (see Figure 1), will eventually dominate the overall gate capacitance if the oxide is too thin. A detailed analysis of oxide thickness optimization for subthreshold operation is therefore necessary. Note that in conventional strong inversion operation, the effective gate capacitance is dominated by the oxide capacitance ($C_{ox}$; Figure 1), and a minimum $T_{ox}$, which improves the subthreshold slope (*S*), is desirable to achieve high performance. In subthreshold operation, the effective gate capacitance ($C_g$) of a transistor is dominated by the intrinsic depletion capacitance and the parasitic (both overlap and fringe) capacitances, which depend strongly on $T_{ox}$: overlap capacitances are inversely proportional to $T_{ox}$, while fringe capacitances are a logarithmic function of the oxide thickness. In energy-constrained design, the primary objective is to choose $T_{ox}$ so as to minimize these capacitances. However, a change in $T_{ox}$ affects both the effective capacitance and the subthreshold swing. Figure 3(a) demonstrates that reducing $T_{ox}$ improves the subthreshold swing *S*; it also increases $C_g$, however, and beyond a certain point the improvement in *S* is masked by the degradation in $C_g$ (Figure 3(a)). Although the improvement in *S* reduces the supply voltage required to achieve a particular performance (i.e., a desired ON current), it may result in an overall increase in power ($C_g V_{DD}^2 f$) due to the increase in $C_g$ if an optimum $T_{ox}$ is not chosen.

Figure 3(b) shows the dynamic energy ($E_{dyn}$) versus $T_{ox}$ for different fanouts. The $V_{DD}$ required for constant $I_{ON}$ reduces with $T_{ox}$, as expected. However, $E_{dyn}$ does not reduce monotonically with oxide thickness, and its minimum occurs at a $T_{ox}$ larger than the minimum $T_{ox}$ (1.2 nm) offered by the technology. Further, the optimum $T_{ox}$ (corresponding to minimum energy) is approximately the oxide thickness at which the increase in $C_g$ exceeds the improvement in subthreshold swing (Figure 3(a)). The optimum $T_{ox}$ has a weak dependency on the fanout (Figure 3(b)); however, the variation in minimum $E_{dyn}$ is less than 2%. The exponential $I_{DS}$-$V_{GS}$ (linear $\log I_{DS}$-$V_{GS}$) relation in the subthreshold region also ensures that the optimum $T_{ox}$ provides minimum dynamic energy at all performance levels (interpreted as $I_{ON}$) as long as the circuit is operated in subthreshold ($V_{DD} < V_{TH}$). It was therefore demonstrated that minimizing the oxide thickness to improve the subthreshold slope does not necessarily minimize energy consumption in digital subthreshold operation. The oxide thickness should instead be optimized considering the changes in both the transistor effective capacitance and the subthreshold slope to achieve minimum power consumption.
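The competing trends discussed above can be reproduced qualitatively with a toy model. It is a sketch under strong assumptions and not the analysis of [12]: the swing follows the textbook expression `ln(10) * kT/q * (1 + C_d / C_ox)`, the parasitic capacitance is modelled only by a term inversely proportional to the oxide thickness (the logarithmic fringe term is omitted), the supply voltage is set to a fixed number of decades of swing, and the depletion capacitance `c_dep` and parasitic constant `k_par` are made-up values.

```python
import math

EPS_OX = 3.9 * 8.854e-12  # permittivity of SiO2, in F/m
V_T = 0.02585             # thermal voltage near 300 K, in volts


def c_ox(t_ox):
    """Oxide capacitance per unit area (F/m^2) for thickness t_ox (m)."""
    return EPS_OX / t_ox


def swing(t_ox, c_dep):
    """Subthreshold swing in V/decade: ln(10) * kT/q * (1 + C_d / C_ox)."""
    return math.log(10.0) * V_T * (1.0 + c_dep / c_ox(t_ox))


def effective_capacitance(t_ox, c_dep, k_par):
    """series(C_ox, C_d) plus an assumed parasitic term k_par / t_ox."""
    cox = c_ox(t_ox)
    intrinsic = cox * c_dep / (cox + c_dep)
    return intrinsic + k_par / t_ox


def energy_proxy(t_ox, c_dep=3.5e-3, k_par=1.5e-12, decades=3.0):
    """C_eff * V_DD^2, with V_DD chosen to span a fixed number of decades."""
    v_dd = decades * swing(t_ox, c_dep)
    return effective_capacitance(t_ox, c_dep, k_par) * v_dd ** 2


def optimum_t_ox(start_angstrom=12, stop_angstrom=40):
    """Sweep the oxide thickness in 0.1 nm steps and return the best value (m)."""
    candidates = [a * 1e-10 for a in range(start_angstrom, stop_angstrom + 1)]
    return min(candidates, key=energy_proxy)


if __name__ == "__main__":
    t_opt = optimum_t_ox()
    print(f"toy-model optimum T_ox ~ {t_opt * 1e9:.1f} nm")
    print(f"swing at 1.2 nm: {swing(1.2e-9, 3.5e-3) * 1e3:.1f} mV/dec")
    print(f"swing at optimum: {swing(t_opt, 3.5e-3) * 1e3:.1f} mV/dec")
```

With these assumed numbers the energy proxy is minimized at a thickness above the thinnest oxide in the sweep, even though the thinnest oxide gives the best swing. The location of the optimum depends entirely on the assumed parasitic constant.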

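Figure 3: (a) Subthreshold swing and effective gate capacitance versus oxide thickness. (b) Dynamic energy versus oxide thickness for different fanouts.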

###### 4.1.3. New Device Sizing Utilizing Reverse Short-Channel Effects for Subthreshold Operation

To design optimal subthreshold circuits using CMOS devices that are targeted for super-threshold operation, it is crucial to develop techniques that exploit the side effects that appear in this new regime. One such mechanism, the pronounced reverse short-channel effect (RSCE), is used to achieve optimal performance in subthreshold circuits [11]. The short-channel effect (SCE, or $V_{TH}$ roll-off) is an undesirable phenomenon in short-channel devices in which $V_{TH}$ decreases as the channel length is reduced. Variation in critical device dimensions translates into a larger variation in the threshold voltage as SCE worsens with increasing DIBL. Typically, nonuniform halo doping is used to mitigate this problem by narrowing the depletion widths and hence reducing the DIBL effect. As a byproduct of halo doping, a short-channel device shows RSCE behavior, in which $V_{TH}$ decreases as the channel length is increased.

In subthreshold circuits, the SCE mechanism is not as strong as in super-threshold circuits because the drain-to-source voltage is very small. RSCE, on the other hand, is still significant enough to affect subthreshold performance. Moreover, current becomes an exponential function of $V_{TH}$ in this regime, which makes it possible to use longer channel-length devices that exploit RSCE to improve drive current. Unlike in super-threshold circuits, using a longer channel length in subthreshold does not have a significant impact on the load capacitance, because of the reduced depletion capacitance under the gate. The method therefore proposes transistor sizing for subthreshold operation that uses RSCE to improve drive current, capacitance, robustness to process variation, subthreshold swing, and energy dissipation.

Table 2 shows the implications of this device sizing for device- and circuit-level properties. The subthreshold swing of the proposed method is 71 mV/dec, which is 16 mV/dec lower than that of the conventional minimum-channel device. The improved subthreshold slope reduces the OFF current by 30% for the same ON current.

At 0.2 V, the $I_{ON}/I_{OFF}$ ratio was 484 for the proposed scheme, a 2.5 times improvement over the conventional minimum-channel device. Circuits using the proposed sizing scheme are more robust against Random Dopant Fluctuations (RDFs) because of the increased gate area at the optimal performance point. The proposed sizing scheme reduces delay and power dissipation simultaneously, which is not possible with conventional sizing schemes. As a result, a significant improvement in energy is obtained. Average delay in ISCAS benchmark circuits was improved by 13%, while average power dissipation and energy dissipation were reduced by 31% and 40%, respectively.

###### 4.1.4. New Device Sizing Based on Subthreshold Logical Effort

In conventional logical effort calculations, the optimal ratio of PMOS width ($W_p$) to NMOS width ($W_n$) for equal current drivability is approximately 2.5 : 1 because of the difference in carrier mobility between PMOS and NMOS devices. In addition, the effective width of a transistor in a stack of *n* devices is roughly 1/*n* in the strong-inversion region. This means that for an *n*-stack to conduct the same current as a single transistor, each device in the stack must be sized up by a factor of *n*. Selecting the proper $W_p$ : $W_n$ ratio and the effective width of stacked transistors is crucial for achieving optimal performance. It was found that the conventional logical effort framework based on strong-inversion operation fails to do so for subthreshold logic because of the difference in transistor current behavior [10]. In the strong-inversion regime, the drive current is a first- or second-order function of the four MOS terminal voltages, whereas in subthreshold designs it is an exponential function of the terminal voltages. A new design paradigm for optimal device sizing, based on the exponential current equation in the subthreshold region, is therefore needed. The optimal PMOS-to-NMOS width ratio in the subthreshold regime was found by simulating a chain of equally sized inverters and observing the rise and fall delays. The results show that a 1.5 : 1 ratio gives equal rise and fall delays at $V_{DD}$ = 0.2 V and that a slightly smaller ratio is optimal for $V_{DD}$ = 0.3 V [10]. This optimization scheme yielded performance gains of up to 13.5% for ISCAS benchmark circuits and 33.1% for component circuits operating in subthreshold, which was shown to match the theoretically attainable improvements.

##### 4.2. Double Gate-MOSFETs for Subthreshold Operation [13–16]

The key benefits of choosing DG-MOSFETs for subthreshold operation are as follows.

(1) The double-gate (DG) MOSFET is promising for subthreshold operation because of its near-ideal subthreshold slope [13].

(2) Devices with a channel length longer than the minimum gate length can be used for robust subthreshold operation without any loss of performance [13].

(3) A raised S/D structure is not necessary for subthreshold operation, so the structure can be greatly simplified [13].

(4) The device has better resiliency to device-parameter and RDF variations because of the underlying SOI structure.

(5) With optimum gate underlap, the parasitic capacitances can be significantly reduced, resulting in higher performance and lower power consumption [14].

(6) Independent control of the front and back gates and asymmetric DGMOS can be used effectively to design low-power and high-performance circuits [15].

(7) Scalability is better than that of bulk CMOS, and device characteristics can be optimized through the choice of device geometries, gate material, work function, and so forth [15].

(8) Junction capacitances are significantly smaller than in bulk CMOS, leading to better power and delay performance.

Various DGMOS device optimization methodologies for subthreshold operation have been identified and are presented in the following subsections.

###### 4.2.1. DGMOS Devices with Optimum Longer Channel Lengths and Simplified S/D Structure for Optimal Subthreshold Operation

It is well known that delay in CMOS circuits is proportional to the load charge and to the inverse of the operating current ($t_d = C_L V_{DD}/I_{ON}$). In super-threshold operation, assuming that the load capacitance is dominated by the gate capacitance ($C_g$) of a load transistor, both the capacitance and the inverse of the current are proportional to the gate length, and hence delay is proportional to the square of the gate length ($L_g^2$). In a short-channel device where velocity saturation occurs, $I_{ON}$ is a weak function of gate length, so the dependence of delay on gate length is determined mainly by $C_g$, and hence delay increases linearly with gate length in super-threshold operation. In contrast, as can be seen in Figure 4, the optimal channel length for maximum performance of DG-MOSFET subthreshold logic is longer than the minimum when the $I_{OFF}$ of every device is matched [13]. As shown in Figure 5, $C_g$ is almost constant regardless of $L_g$ in the DG-MOSFET subthreshold device, because the main components of $C_g$ for the subthreshold DG-MOSFET are the gate overlap capacitance and the fringing gate capacitance, which do not depend on $L_g$. Note that the intrinsic capacitance of the DG-MOSFET is negligible [13]. Hence, the dependence of delay on $L_g$ in DG-MOSFET subthreshold operation is determined mainly by $I_{ON}$. With a relatively small increase in $C_g$, a longer-channel device has a larger $I_{ON}$ in the subthreshold region under the same $I_{OFF}$ condition because of its smaller subthreshold slope. Note that $I_{ON}$ in the subthreshold region is determined by the subthreshold slope (*S*) only if $I_{OFF}$ is fixed. The $I_{OFF}$ of each device is matched across different $L_g$ by adjusting the metal gate work function [13].

As shown in Figure 6, *S* of the short-channel device is larger than that of the long-channel device because of the short-channel effect. Figure 6 also shows the dependence of $I_{ON}$ on *S* in the subthreshold region. Since the current does not increase with $L_g$ once *S* approaches the ideal limit (Figure 6), there is an optimal $L_g$ for minimum delay, as shown in Figure 4. Hence, the optimal channel length for subthreshold operation is the minimum channel length that has an ideal subthreshold slope.
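The argument that a smaller subthreshold slope yields a faster device at matched OFF current can be sketched numerically. The snippet assumes the idealized relation `I_ON = I_OFF * 10^(V_DD / S)` and the delay expression `C * V_DD / I_ON`; the swing values, capacitances, and OFF current are hypothetical and are not taken from [13].

```python
def on_current(i_off, v_dd, swing_mv_per_dec):
    """ON current at a matched OFF current: I_off * 10^(V_DD / S)."""
    return i_off * 10.0 ** (v_dd * 1e3 / swing_mv_per_dec)


def gate_delay(c_g, v_dd, i_on):
    """First-order gate delay: load charge divided by drive current."""
    return c_g * v_dd / i_on


if __name__ == "__main__":
    v_dd, i_off = 0.2, 1e-9                            # assumed supply (V) and OFF current (A)
    short = {"swing": 80.0, "c_g": 1.00e-15}           # hypothetical short-channel device
    long_ = {"swing": 65.0, "c_g": 1.05e-15}           # hypothetical longer-channel device
    for name, dev in (("short", short), ("long", long_)):
        i_on = on_current(i_off, v_dd, dev["swing"])
        print(name, f"I_ON = {i_on:.2e} A, delay = {gate_delay(dev['c_g'], v_dd, i_on):.2e} s")
```

Because the ON current depends exponentially on the inverse swing, a modest improvement in swing outweighs a small increase in gate capacitance in this idealized picture; once the swing reaches its ideal limit, lengthening the channel further brings no more current.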

Figure 7 shows that the short-channel device is more sensitive to $L_g$ variation than the long-channel device because of drain-induced barrier lowering. Figure 7 also shows that variation in $T_{ox}$ causes negligible change in $I_{ON}$ for the long-channel device, while short-channel devices experience a relatively large $I_{ON}$ variation due to $T_{ox}$ variation. Figure 7 further shows that variation in the silicon body thickness ($T_{si}$) causes $I_{ON}$ variation for long-channel symmetric devices because of the volume inversion effect. The short-channel device experiences more variation because of two-dimensional short-channel effects in addition to volume inversion [13]. Long-channel devices are therefore more suitable for subthreshold operation than short-channel devices.

###### 4.2.2. DGMOS Devices with Optimum Underlap for Subthreshold Operation

The impact of gate underlap on the effective gate capacitance of the double-gate MOS (DGMOS) transistor for digital subthreshold operation has been analyzed in [14]. With optimum gate underlap, the parasitic fringe capacitances of DGMOS can be significantly reduced, resulting in higher performance and lower power consumption [14]. Figure 8 shows the schematic of an underlap DGMOS device. The parasitic capacitances of DGMOS include the overlap ($C_{ov}$) and the fringe ($C_{fr}$) capacitances. Since an underlap device has no $C_{ov}$, the effective gate capacitance ($C_g$) is dominated by $C_{fr}$. The fringe capacitance of DGMOS consists of inner ($C_{if}$) and outer ($C_{of}$) fringe components, which depend strongly on the device geometry.

It can be seen from Figure 9 that the effective gate capacitance initially decreases with increasing underlap and then flattens. This is because the effective gate capacitance in subthreshold is dominated by the fringe capacitance, which is a logarithmic function of the underlap. In contrast, the gate capacitance of the device operated in strong inversion is dominated by the gate-oxide capacitance and hence does not vary considerably with underlap. While the effective gate capacitance decreases with the gate underlap, the ON-current (drain current with the gate and drain both at the 0.2 V supply) and the OFF-current (drain current at zero gate voltage) also decrease with the underlap (Figure 10). Initially, the percentage reduction in ON-current is larger than that in effective gate capacitance, which indicates that in this region the delay of the circuit with underlap is larger than in the no-underlap case. Beyond a certain underlap (15 nm), the capacitance still reduces logarithmically with underlap, while the ON-current decreases only linearly, so its percentage reduction is smaller than that of the capacitance (Figure 10). Consequently, for underlap beyond 15 nm, the delay of the RO decreases as the underlap increases. Though the delay of the RO first increases with the underlap and then decreases, both power and PDP reduce monotonically with underlap. A 40% improvement in delay can be achieved with the optimum underlap, along with a 7.3× reduction in PDP, for a full adder circuit. Table 3 shows that the subthreshold full adder circuit (0.2 V supply) with 50 nm underlap DGMOS devices can be operated at 1.25 GHz with 6.2× less energy consumption than the zero-underlap device [14].
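
The non-monotonic delay trend can be made concrete with a minimal sketch that assumes the usual first-order gate delay model, delay proportional to (gate capacitance × supply voltage) / ON-current, at a fixed supply. The function `relative_delay` and the fractional reductions used below are hypothetical values chosen for illustration; they are not data from [14].

```python
def relative_delay(cap_reduction, current_reduction):
    """Delay with underlap divided by delay without underlap.

    Assumes delay ~ C_G * V_DD / I_on at fixed V_DD, so only the
    fractional reductions in C_G and I_on matter.
    """
    if not (0.0 <= cap_reduction < 1.0 and 0.0 <= current_reduction < 1.0):
        raise ValueError("reductions must be fractions in [0, 1)")
    return (1.0 - cap_reduction) / (1.0 - current_reduction)


# Hypothetical operating points (not data from [14]):
small_underlap = relative_delay(cap_reduction=0.10, current_reduction=0.20)
large_underlap = relative_delay(cap_reduction=0.60, current_reduction=0.40)

# Current falls faster than capacitance: the gate gets slower.
print(f"small underlap: delay x{small_underlap:.2f}")
# Capacitance falls faster than current: the gate gets faster.
print(f"large underlap: delay x{large_underlap:.2f}")
```

The ratio exceeds 1 whenever the ON-current drops by a larger fraction than the capacitance, and falls below 1 once the capacitance reduction dominates, which is the crossover described in the text.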

###### 4.2.3. DGSOI Technology with Codesign Methodology for Optimal Subthreshold Operation

This section reviews a design methodology spanning all levels of the hierarchy (device, circuit, and architecture) for ultralow-power digital subthreshold operation (supply voltage below the threshold voltage). It has been demonstrated that conventional design techniques are not optimal for subthreshold design. With proper codesign [16, 17], it is possible to obtain hundreds of MHz of performance in subthreshold systems at very low power. It has further been demonstrated that double-gate MOSFETs are better suited for subthreshold operation (~10× higher throughput at iso-power) than bulk MOSFETs [16]. This is mainly because DG-SOI has no intrinsic capacitance in the subthreshold region.

Double-gate MOS (DGMOS) transistors are suitable for subthreshold operation due to their near-ideal subthreshold slope and negligible junction capacitance. Because the thin, fully depleted silicon body is sandwiched between two gates, these devices have excellent gate control over the channel. Furthermore, the undoped thin silicon body provides negligible source/drain p-n junction capacitance, which greatly enhances circuit performance. In subthreshold operation, the intrinsic capacitance of DGMOS is also negligible and depends only weakly on the channel length. For iso-OFF-current conditions, Table 4 presents a comparison of the important properties of the standard and optimized bulk and DG-SOI devices for subthreshold operation. Due to their near-ideal subthreshold slope, the DG-SOI devices have almost an order of magnitude higher ON-current than the bulk devices [16]. Table 4 illustrates the PDP of the bulk inverter and the SOI inverter (each driving another inverter) operating in the subthreshold regime. The DG-SOI inverter has almost one order of magnitude lower PDP than the corresponding bulk device. This can be ascribed to the fact that the intrinsic capacitance of DG-SOI is negligibly small, and hence the switching energy is extremely low. This makes DG-SOI an extremely powerful technology for subthreshold design. Subthreshold pseudo-NMOS is also more efficient than subthreshold CMOS in terms of PDP, in both the bulk and the DG-SOI technologies. Simulation results (for both technologies) of a pseudo-NMOS inverter (driving an identical inverter) and a CMOS inverter are compared in Table 4. In the bulk subthreshold region, pseudo-NMOS gives approximately 20% improvement in PDP compared to CMOS; in DG-SOI, the improvement is more than 30%.
With the device/circuit/architectural optimizations, the throughput obtained for the bulk technology is more than two times better (at iso-power) than with the conventional design (Table 4). The same strategy applied to DG-SOI results in a 3.8× improvement in throughput at iso-power [16]. Thus, significant improvement can be achieved by proper codesign across device, circuit, and architecture. An overall comparison of the two technologies in the subthreshold regime, in terms of the power-throughput tradeoff of the FIR filter after device-, circuit-, and architecture-level optimization, shows that DG-SOI achieves more than 10× improvement in throughput at iso-power compared to bulk. This is mainly because DG-SOI in the subthreshold domain has no intrinsic capacitance, whereas bulk transistors do. This significant lowering of the device capacitance increases the throughput of the overall system at iso-power. In summary:

(i) By proper codesign of device, circuit, and architecture, the throughput at iso-power in the subthreshold region can be improved.
(ii) DG-SOI MOSFETs inherently have no intrinsic capacitance in the subthreshold region, which gives a significant improvement in PDP and makes DG-SOI better suited to subthreshold operation than the corresponding bulk technology.

##### 4.3. Carbon Nanotube FET (CNFET) Technology for Subthreshold Operation

Aggressive scaling of CMOS devices over successive technology generations has led to higher integration density and performance. However, short-channel effects such as the exponential increase in leakage current and large parameter variations stand in the way of scaling devices much beyond 10 nm. Hence, research has started in earnest on alternative devices and circuit architectures for a sub-10-nm transistor era. Carbon nanotubes (CNTs) and molecular transistors have already gained widespread attention as possible alternative nanoscale transistors. CNTs are sheets of graphite rolled into the shape of a tube. Depending on the direction in which the nanotubes are rolled (chirality), they can be either metallic or semiconducting. Since their discovery in the early 1990s, there has been immense research into the electrical properties of CNTs. Semiconducting nanotubes have been used in high-performance transistors where the channel is the nanotube itself. High-performance carbon nanotube field-effect transistors (CNFETs) with very high ON-currents have been reported, and the device physics has evolved [47–55]. As high-mobility devices are being investigated, near-ballistic transport no longer seems impossible. The absence of scattering in the channel is the characteristic of ballistic devices [50], which makes them ultrahigh-speed and well suited to high-performance circuit design. However, the theory of CNT transistors is still primitive and the technology is still nascent.

To determine whether the CNFET meets the performance and device requirements, the traditional MOSFET has been compared with the newly developed CNFET. For this comparison, the authors assumed that the CNFET takes on the same characteristics as the MOSFET [51]. The parameter code for the MOSFET and the CNFET was developed by Arijit Raychowdhury, graduate student mentor in electrical and computer engineering at Purdue University. To build the FET circuits, the authors of [52] used the parameter codes as include files within their main circuit codes. To compare the two types of transistors, they designed and tested an inverter, a ring oscillator, a full adder, and a 4-bit ripple carry adder (RCA) built from both MOSFETs and CNFETs.

Table 5 shows that in super-threshold operation, the ring oscillator built from CNFETs runs at a frequency about 2000 times higher than the MOSFET-based ring oscillator, the CNFET full adder is 125 times faster with just 1% of the PDP of an equivalent MOSFET-based full adder, and the 4-bit CNFET RCA is 61 times faster with 1% of the PDP of an equivalent MOSFET-based RCA. Table 6 shows that in subthreshold operation, the CNFET ring oscillator runs at a frequency about 8400 times higher than the MOSFET-based RO, and the 4-bit RCA designed with CNFETs is 440 times faster with only 0.3% of the PDP of an equivalent MOSFET-based 4-bit RCA at 65 nm. This shows the superiority of CNFET-based circuits over MOSFET-based circuits in both subthreshold and super-threshold operation, and particularly in subthreshold operation. The subthreshold MOSFET ring oscillator runs 85% slower than the super-threshold MOSFET ring oscillator, whereas the subthreshold CNFET ring oscillator runs only 36% slower than the super-threshold CNFET ring oscillator.

#### 5. Logic Families for Subthreshold Operation

In this section, we evaluate the scope of logic families other than static CMOS for designing optimal subthreshold logic circuits, in terms of the robustness, power, and performance improvements they can bring to subthreshold operation. The following logic families have been identified as suitable for designing more robust and energy-efficient subthreshold circuits, with some tradeoffs:

(i) Subthreshold CMOS logic.
(ii) Subthreshold pseudo-NMOS logic.
(iii) Variable threshold voltage (VT) subthreshold CMOS logic.
(iv) Subthreshold DTMOS logic.
(v) Subthreshold Domino logic.
(vi) Subthreshold Pass Transistor (PT) logic.
(vii) Subthreshold Dynamic Threshold PT (DTPT) logic.

##### 5.1. Subthreshold CMOS Logic

Subthreshold CMOS (Sub-CMOS) logic is conventional CMOS logic operated in the subthreshold region. The voltage transfer characteristic (VTC) of an inverter running in subthreshold mode is closer to ideal than the VTC in the normal strong-inversion region [19]. The improvement is mainly caused by the increase in circuit gain: the exponential relationship between drain current and gate voltage in the subthreshold region gives rise to an extremely high transconductance. The much improved VTC yields better noise margins. Circuit designers also have more freedom in sizing the circuits while still obtaining a near-optimum delay than in strong-inversion CMOS, because the delay stays flat over a wider range of PMOS-to-NMOS ratio [19]. Power supply variation, however, has a significant negative impact on subthreshold circuits: the sensitivity of the gate delay to supply voltage variation increases by a factor of 8 with decreasing power supply value for subthreshold CMOS logic [19]. Hence, supply voltage stabilization is crucial for the proper operation of subthreshold circuits.
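
The exponential current-voltage relationship behind this gain argument can be illustrated with a minimal sketch of the standard first-order subthreshold current model. The parameter values (`v_th`, `i0`, `n`) are placeholders chosen for illustration and are not taken from [19]. In this model the transconductance equals the drain current divided by n times the thermal voltage, so the transconductance per unit current is fixed at its maximum value regardless of bias.

```python
import math

K_B = 1.380649e-23       # Boltzmann constant, J/K
Q_E = 1.602176634e-19    # elementary charge, C


def thermal_voltage(temp_k=300.0):
    """kT/q in volts."""
    return K_B * temp_k / Q_E


def subthreshold_current(v_gs, v_ds, v_th=0.4, i0=1e-7, n=1.5, temp_k=300.0):
    """I_ds = i0 * exp((v_gs - v_th) / (n * v_T)) * (1 - exp(-v_ds / v_T))."""
    v_t = thermal_voltage(temp_k)
    return i0 * math.exp((v_gs - v_th) / (n * v_t)) * (1.0 - math.exp(-v_ds / v_t))


def transconductance(v_gs, v_ds, v_th=0.4, i0=1e-7, n=1.5, temp_k=300.0):
    """Analytic g_m = dI_ds/dV_gs = I_ds / (n * v_T) for the model above."""
    i_ds = subthreshold_current(v_gs, v_ds, v_th, i0, n, temp_k)
    return i_ds / (n * thermal_voltage(temp_k))


if __name__ == "__main__":
    for v_gs in (0.10, 0.15, 0.20):
        i_ds = subthreshold_current(v_gs, v_ds=0.2)
        g_m = transconductance(v_gs, v_ds=0.2)
        print(f"V_gs={v_gs:.2f} V  I_ds={i_ds:.3e} A  g_m/I_ds={g_m / i_ds:.1f} 1/V")
```

The same exponential dependence also explains the strong delay sensitivity to supply variation noted above: a small change in gate drive changes the current, and hence the delay, by a multiplicative factor.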

##### 5.2. Subthreshold Pseudo-NMOS Logic

In the subthreshold region, pseudo-NMOS logic is more robust than pseudo-NMOS logic in strong inversion: its VTC is closer to the ideal curve, its output swings rail-to-rail because of the large gain in the subthreshold region, and it does not suffer from the degraded low logic level seen in strong inversion. Pseudo-NMOS also operates faster than CMOS while consuming less area [19]. The two main disadvantages of pseudo-NMOS in strong inversion compared to CMOS, namely higher power consumption and lower robustness, are eliminated in the subthreshold region due to the ideal device characteristics. In summary, subthreshold pseudo-NMOS has better PDP than, and comparable robustness to, static CMOS in the subthreshold region.

##### 5.3. VT Sub-CMOS Logic

To ensure proper operation under temperature and process variations, two subthreshold logic families, namely, variable threshold voltage subthreshold CMOS logic (VT-Sub-CMOS logic) and subthreshold dynamic threshold voltage MOS logic (Sub-DTMOS logic), have been proposed [20]. Both logic families show a significant improvement in stability against temperature and process variations while maintaining the same ultra low-power design constraint. VT-Sub-CMOS logic is Sub-CMOS logic with an additional stabilization scheme. The stabilization circuit monitors any change in the transistor current due to temperature and process variations and provides an appropriate bias to the substrate. Any increase of the current above a certain prespecified threshold value is thus reduced by an appropriate bias to the substrate. Both the logic and the stabilization circuits of VT-Sub-CMOS work in the subthreshold region, that is, with a supply voltage less than the threshold voltage of the transistor. With proper substrate biasing, stable operation can thus be achieved in VT-Sub-CMOS logic, thereby increasing the robustness of the circuit. However, the stabilization scheme incurs additional overhead in area and circuit complexity.
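
The monitor-and-bias idea can be sketched as a toy feedback loop. This is not the stabilization circuit of [20]: it assumes a simple exponential leakage model, represents the effect of the substrate bias directly as a shift in the effective threshold voltage, and ignores the temperature dependence of the threshold voltage itself. All names and parameter values (`leakage_current`, `stabilise`, `i0`, `n`, `gain`) are hypothetical.

```python
import math

K_B = 1.380649e-23       # Boltzmann constant, J/K
Q_E = 1.602176634e-19    # elementary charge, C


def thermal_voltage(temp_k):
    """kT/q in volts."""
    return K_B * temp_k / Q_E


def leakage_current(v_th, temp_k, i0=1e-6, n=1.5):
    """Toy subthreshold current at V_gs = 0: i0 * exp(-v_th / (n * v_T))."""
    return i0 * math.exp(-v_th / (n * thermal_voltage(temp_k)))


def stabilise(temp_k, i_limit, v_th0=0.4, n=1.5, gain=0.5, steps=60):
    """Raise the effective threshold (standing in for a substrate bias) while
    the monitored current exceeds i_limit.

    Returns (threshold shift in volts, final current in amperes).
    """
    shift = 0.0
    for _ in range(steps):
        current = leakage_current(v_th0 + shift, temp_k, n=n)
        if current <= i_limit:
            break
        # A log-domain current error maps linearly onto the threshold shift needed.
        shift += gain * n * thermal_voltage(temp_k) * math.log(current / i_limit)
    return shift, leakage_current(v_th0 + shift, temp_k, n=n)


if __name__ == "__main__":
    limit = leakage_current(0.4, 300.0)       # current at the nominal temperature
    shift, final = stabilise(373.0, limit)    # device heated to about 100 degrees C
    print(f"threshold shift applied: {shift * 1e3:.1f} mV")
    print(f"current / limit after stabilisation: {final / limit:.6f}")
```

In this sketch the loop leaves the device untouched when the current is at or below the limit and otherwise pulls the current back down to the limit, mirroring the behavior described for the stabilization scheme.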

Table 7 shows that for 10% change in , the change in the energy/switching (PDP) ranges from 0.1% to 1.4% for strong-inversion CMOS logic, from 34.7% to 96.2% for Sub-CMOS logic, and only from 5% to 42.4% for VT-Sub-CMOS logic, showing the improved tolerance of VT-Sub-CMOS to variations. For a temperature change from 25 °C to 100 °C, the energy/switching of strong-inversion CMOS logic changes by only 28.2%, that of Sub-CMOS logic by 61.5%, and that of VT-Sub-CMOS logic by only 33.7% [20].

##### 5.4. Sub-DTMOS Logic

Sub-DTMOS logic provides an alternative way to achieve the same stability through direct substrate biasing, without the additional control circuitry needed in VT-Sub-CMOS logic. Sub-DTMOS logic uses transistors whose gates are tied to their substrates [21]. Because the substrate voltage in Sub-DTMOS logic changes with the gate input voltage, the threshold voltage changes dynamically. In the OFF-state, that is, with the gate at 0 V (at the supply voltage) for NMOS (PMOS), the characteristics of a DTMOS transistor are exactly the same as those of a regular MOS transistor: both have the same off-current, subthreshold slope, and threshold voltage. In the ON-state, however, the substrate-source junction is forward-biased, which reduces the threshold voltage of the DTMOS transistor. The reduced threshold voltage is due to the reduction of body charge. The reduction of body charge brings another advantage, namely higher carrier mobility, because the reduced body charge causes a lower effective normal field. The reduced threshold voltage, lower effective normal field, and higher mobility result in a higher ON-current drive in DTMOS than in a regular MOS transistor. Furthermore, the subthreshold slope of DTMOS improves and approaches the ideal 60 mV/decade, which makes it more efficient in subthreshold logic circuits for obtaining higher gain [21]. Another significant advantage of Sub-DTMOS logic is that it does not require any additional limiter transistors, which further reduces the design complexity. In contrast, in the normal strong-inversion region, limiter transistors are necessary to keep the forward body bias below 0.6 V. This prevents forward-biasing the parasitic PN junction diode while allowing a much higher power supply to be used in the circuit. The PDP of DTMOS is comparable to the PDP of regular CMOS [21].
Thus, using DTMOS logic, the circuit can be operated at a much higher frequency while maintaining the same energy/switching, with enhanced robustness compared to static CMOS.
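
The dynamic threshold reduction can be illustrated with the textbook body-effect expression, V_th = V_th0 + γ(√(2φ_F − V_bs) − √(2φ_F)), evaluated with the body tied to the gate. This is a minimal sketch under assumed parameters: the values of `v_th0`, `gamma`, and `two_phi_f` are placeholders, not figures from [21], and mobility and subthreshold-slope improvements are not modeled.

```python
import math


def threshold_voltage(v_bs, v_th0=0.4, gamma=0.4, two_phi_f=0.8):
    """Body-effect model: V_th = V_th0 + gamma * (sqrt(2phi_F - V_bs) - sqrt(2phi_F)).

    v_bs > 0 is a forward body bias (NMOS convention) and lowers V_th.
    """
    if v_bs >= two_phi_f:
        raise ValueError("model is only valid for v_bs < 2*phi_F")
    return v_th0 + gamma * (math.sqrt(two_phi_f - v_bs) - math.sqrt(two_phi_f))


def dtmos_threshold(v_gs, **params):
    """In DTMOS the body is tied to the gate, so V_bs follows V_gs."""
    return threshold_voltage(v_bs=v_gs, **params)


if __name__ == "__main__":
    for v_gs in (0.0, 0.1, 0.2):
        print(f"V_gs={v_gs:.1f} V: regular V_th={threshold_voltage(0.0):.3f} V, "
              f"DTMOS V_th={dtmos_threshold(v_gs):.3f} V")
```

At zero gate voltage the two devices have the same threshold, matching the identical OFF-state behavior described above, while the DTMOS threshold drops as the gate voltage rises. With a subthreshold supply the forward bias stays well below the junction turn-on voltage, which is why no limiter is needed.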

##### 5.5. Subthreshold Domino Logic

Subthreshold static and ratioed logic has recently been proposed to satisfy the ultra-low-power requirement in applications such as hearing aids, pacemakers, and wearable wristwatch computers. These logic circuits, however, can be operated only at lower frequencies because of the lower supply voltage. To increase the frequency of operation, a subthreshold dynamic logic, Subdomino logic, has been proposed [22]. A standard full adder circuit implemented in both Subdomino and Sub-CMOS logic operating in the subthreshold region has been simulated. The results in Table 8 show that Subdomino logic has lower power consumption (32% of Sub-CMOS), smaller area (60% of Sub-CMOS), and is 3 times faster than Sub-CMOS logic. It has also been shown that Subdomino logic has excellent noise margin [22].
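
As a quick worked example, the quoted power and speed figures can be combined into a relative power-delay product, assuming PDP is simply power multiplied by delay and that both figures were obtained under the same operating conditions (an assumption, since Table 8 is not reproduced here). The helper `pdp_ratio` is a hypothetical name used only for this illustration.

```python
def pdp_ratio(power_ratio, speedup):
    """PDP relative to a baseline, given power relative to the baseline
    and a speed-up factor (baseline delay / new delay)."""
    if power_ratio <= 0 or speedup <= 0:
        raise ValueError("ratios must be positive")
    return power_ratio / speedup


# Figures quoted above for Subdomino vs Sub-CMOS: 32% of the power, 3x faster.
subdomino_vs_subcmos = pdp_ratio(power_ratio=0.32, speedup=3.0)
print(f"Subdomino PDP is about {subdomino_vs_subcmos:.3f} of Sub-CMOS")
```

Under these assumptions the Subdomino full adder would need roughly a tenth of the energy per operation of its Sub-CMOS counterpart.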

##### 5.6. Subthreshold DTPT Logic

For pass transistor logic, dynamic threshold transistors whose gates are tied to their substrates can be used, forming subthreshold dynamic threshold pass transistor (Sub-DTPT) logic [24]. Sub-DTPT logic has been observed to show better stability against temperature variation than the corresponding subthreshold pass transistor (Sub-PT) logic. For example, in the second XOR structure in [24], the delay reduction caused by a 100 °C temperature increase is 17.8% for Sub-PT logic and just 7.2% for Sub-DTPT logic.

#### 6. Conclusions

As supply voltage continues to scale with each new generation of CMOS technology, subthreshold design is an inevitable choice in the semiconductor roadmap for achieving ultra low-power consumption. Device optimization is a must for optimal subthreshold operation to further reduce power and enhance performance. Comparative studies show that double-gate SOI devices and CNFETs are better candidates for subthreshold operation than bulk CMOS devices. At the circuit level, Sub-Pseudo-NMOS, Sub-DTPT, and Subdomino logic can be considered for robust subthreshold operation due to their improved performance and better stability under PVT variations, with reduced or comparable energy/switching relative to conventional static CMOS logic. A device/circuit codesign methodology can further enhance subthreshold operation in terms of performance and robustness.