Research Article  Open Access
A Modified Implementation of Tristate Inverter Based Static Master-Slave Flip-Flop with Improved Power-Delay-Area Product
Abstract
The paper introduces novel architectures for the implementation of fully static master-slave flip-flops for low power, high performance, and high density. Based on the proposed structure, the traditional C^{2}MOS latch (tristate inverter/clocked inverter) based flip-flop is implemented with fewer transistors. The modified C^{2}MOS based flip-flop designs mC^{2}MOSff1 and mC^{2}MOSff2 are realized using only sixteen transistors each, while the number of clocked transistors is also reduced in the case of mC^{2}MOSff1. Post-layout simulations indicate that, at 16X capacitive load, the mC^{2}MOSff1 flip-flop shows a 12.4% improvement in PDAP (power-delay-area product) when compared with the transmission gate flip-flop (TGFF), which is considered to be the best design alternative among the conventional master-slave flip-flops. To validate the correct behaviour of the proposed design, an eight-bit asynchronous counter is designed to layout level. LVS and parasitic extraction were carried out on Calibre, whereas layouts were implemented using IC Station (Mentor Graphics). HSPICE simulations were used to characterize the transient response of the flip-flop designs in a 180 nm/1.8 V CMOS technology. Simulations were also performed at 130 nm, 90 nm, and 65 nm to reveal the scalability of both designs at modern process nodes.
1. Introduction
Flip-flops are the key elements used in sequential digital systems. The appropriate selection of flip-flop topologies is instrumental in the design of VLSI integrated circuits such as microprocessors, microcontrollers, and other high complexity chips. However, factors such as high performance, low power, transistor count, clock load, design robustness, power-delay, and power-area trade-offs are generally considered before choosing a particular flip-flop design. The highest operating frequency of clocked digital systems is determined by the flip-flops. Flip-flops and the clock distribution network generally account for 30–70% of the total chip power consumption [1, 2]. Clock load is another major concern for digital system designers, and several contributions have been reported in the past to reduce clock load and the associated power dissipation in the clocking network [3–5]. A design with an elevated transistor count occupies a larger area on chip and leads to an increase in the overall manufacturing cost. Hence, design and implementation of low power, high performance flip-flops with the least possible chip area is the main target of the modern chip manufacturing industry.
Flip-flops are broadly classified into three main categories, namely, master-slave [6–11], pulse-triggered [12–17], and differential flip-flops [18–21]. Among them, master-slave and pulse-triggered flip-flops are the most efficient in terms of power-delay product. Master-slave flip-flops exhibit a positive setup time and a negative hold time and hence are not suitable for high speed systems due to extended data-to-output delays. But they are power efficient and can be used in low power applications. However, their main limitation is lower robustness to clock skew. Pulse-triggered flip-flops have negative setup time and thus lead to a smaller data-to-output delay. They exhibit an inherent soft clock edge property which minimizes clock skew related cycle time loss.
A classification of master-slave flip-flops is further elaborated in Figure 1. Clock-gated topologies use internal clock gating, based on a clock gating logic and a comparator circuit, to suppress the power consumption at lower data switching activities. However, clock-gated flip-flops have extended latency due to increased clock-to-output delays, along with increased chip area overhead. Clock-gated structures generally consume less power at low switching activities [22]. TGFF represents the best choice in the non-clock-gated flip-flop category in terms of power-delay product [6], whereas the existence of NMOS transistors in the critical path along with partially non-gated keepers leads to less favourable power-delay trade-off characteristics in the case of the write port master-slave flip-flop (WPMS) [7, 8] and the pass transistor logic based flip-flop (PTLFF) [9].
In this paper, we introduce an alternative design approach for designing a C^{2}MOS based master-slave flip-flop, based on a new architecture with reduced transistor count and improved power-delay-area product. The proposed configurations mC^{2}MOSff1 and mC^{2}MOSff2 fall under the non-clock-gated flip-flop category as shown in Figure 1.
The rest of the paper is organized as follows. Section 2 compares the conventional master-slave flip-flop configurations with the proposed designs. Section 3 highlights the simulation parameters and test bench along with the techniques used for transistor sizing and the methodology adopted for optimization of timing and power-delay product. Section 4 describes the simulation results. Section 5 concludes the paper. An appendix is added to show the calibration of parameters for delay calculations using LE theory and to outline the strategy followed for designing the eight-bit ripple counter.
2. Overview of Previous Work and Proposed Designs
Figure 2 shows the conventional master-slave flip-flop architecture, whereby two regenerative loops (L1 and L2) are present in the master and slave sections to provide static functionality. Both loops operate independently of each other on complementary clock signals. Regenerative loops are composed of cross-coupled inverters. It can be observed from Figure 2 that for each loop, regenerative action is achieved through one inversion in the forward (critical) path while the other (clocked) inversion takes place in the feedback path. Moreover, there is no common component between the two loops.
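As a minimal behavioural sketch of the two-loop organization described above (not the authors' circuit), the following Python model treats the master and slave sections as ideal level-sensitive latches on complementary clock phases. The class and function names are hypothetical, inversions in the signal path are ignored, and negative-edge-triggered operation is assumed.

```python
class MasterSlaveFlipFlop:
    """Idealized master-slave flip-flop (assumed negative-edge triggered)."""

    def __init__(self):
        self.master = 0  # value held by the master regenerative loop
        self.slave = 0   # value held by the slave regenerative loop (drives Q)

    def apply(self, clk, d):
        if clk:
            # Clock high: master is transparent, slave loop holds Q.
            self.master = d
        else:
            # Clock low: master loop holds, slave is transparent.
            self.slave = self.master
        return self.slave


def simulate(clk_wave, d_wave):
    """Apply sampled clock/data waveforms and return the sampled Q waveform."""
    ff = MasterSlaveFlipFlop()
    return [ff.apply(c, d) for c, d in zip(clk_wave, d_wave)]


clk = [1, 1, 0, 0, 1, 1, 0, 0]
d = [1, 1, 1, 0, 0, 0, 0, 1]
q = simulate(clk, d)
```

In this model Q changes only when the clock goes from high to low, and data changes that occur while the clock is low do not reach the output until the next falling edge.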
Since an inverter followed by a transmission gate is equivalent to a clocked inverter, the combination is replaced by a clocked inverter to form a C^{2}MOS based flip-flop architecture as shown in Figure 3 [23]. Two regenerative loops L3 and L4 are used in a similar manner as in the previous case to maintain the static nature of the flip-flop.
However, in the proposed architecture as reported in Figure 4(a), both inversions take place in the forward (critical) path and the loop is completed by a clocked switch for loop L6, while loop L5 is completed by using an inverter in the feedback path. It can be seen from Figure 4(a) that the output node is always driven and never floating, thus ensuring static flip-flop operation. The size of the transistors in the feedback path marked by asterisks (*) is kept at 360 nm (minimum technology width) to eliminate race conditions at nodes U and V. Yet another implementation is shown in Figure 4(b), which uses inverter INVX in the critical path and a clocked switch to form a regenerative loop L7. It is to be noted that INVX is common to both the regenerative loops L7 and L8, which is contrary to the realization of the previous architectures.
Figure 5 represents the actual circuit design based on the proposed architectures in Figure 4, while TGFF is implemented using transmission gates as switches in the conventional architecture as demonstrated in Figure 6.
Figure 5 panels: (a) mC^{2}MOSff1; (b) mC^{2}MOSff2.
It can be clearly observed that mC^{2}MOSff1 and mC^{2}MOSff2 are both realized using sixteen transistors each. As a result, the area occupied by the proposed designs is significantly smaller than that of the conventional designs. Moreover, the number of clocked transistors in mC^{2}MOSff1 is six, as compared to eight in the case of TGFF or the conventional clocked inverter based flip-flop C^{2}MOSff [23].
To illustrate the performance of the proposed flip-flop configurations, other flip-flop topologies belonging to the master-slave class, namely, TGFF, WPMS, PTLFF, gated master-slave latch (GMSL) [10], and data transition look ahead flip-flop (DTLA) [11], have been used for comparisons. Of these topologies, GMSL and DTLA represent flip-flops with internal clock gating. Schematic diagrams of WPMS, PTLFF, GMSL, and DTLA are shown in Figures 7, 8, 9, and 10, respectively.
3. Simulation Parameters, Test Bench, and Optimization Methodology
Table 1 lists the CMOS parameters used for creating the simulation environment. The flip-flops were designed to layout level in a 180 nm/1.8 V CMOS process at 250 MHz clock frequency. The width of the transistors in the feedback structures was invariably fixed at the minimum value of 360 nm, while the slope of the data and clock signals was kept at 100 ps. The performance of the various flip-flop configurations is evaluated through SPICE simulation of the circuits extracted from the layout with the inclusion of parasitics.

Figure 11 shows the simulation test bench for characterization and comparison of the FF designs [3]. The clock and data signals are fed to the flip-flop through a two-stage buffer. Data-to-output delay is used for performance comparisons. Logical effort theory is extensively used for designing fast CMOS circuits based on pencil and paper calculations and is widely adopted in the literature [24]. Hence, the delay sensitivity factor introduced by Alioto et al. [25] based on logical effort theory has been used for performance optimization.
A 16-cycle long pseudorandom sequence with a switching factor $\alpha = 0.5$ is supplied at the data input for measurement of average power [26]. Since the delay and power characterization are strongly dependent on the capacitive load offered to the FFs [27], varying capacitive loads {4, 16, 64} $\times$ $C_{min}$, where $C_{min}$ is the input capacitance of a symmetrical minimum inverter, have been used to test the FF behaviour. The transistor sizing methodology adopted is the same as that in [28, 29], whereas power-delay product (PDP) and power-delay-area product (PDAP) are the chosen figures of merit (FOM).
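To make the notion of a switching factor concrete, the sketch below counts data transitions per clock cycle for a bit sequence that is applied repeatedly (so the last bit wraps around to the first). Both the definition used here and the 16-bit pattern are illustrative assumptions; the paper does not list the actual sequence.

```python
def switching_factor(bits):
    """Fraction of clock cycles in which the data value changes.

    The sequence is treated as periodic, so the last bit is compared
    with the first one (an assumption made for this illustration).
    """
    following = bits[1:] + bits[:1]
    transitions = sum(a != b for a, b in zip(bits, following))
    return transitions / len(bits)


# Hypothetical 16-cycle pattern, chosen only so that half the cycles toggle.
sequence = [0, 0, 1, 1, 0, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1]
alpha = switching_factor(sequence)
```

A constant input gives a factor of 0 and an alternating 0101... input gives 1, which brackets the pseudorandom case.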
The expression relating the absolute gate capacitance, in femtofarads (fF), to the absolute transistor width, in nanometers (nm), obtained at the 180 nm process node by fitting simulation data [30], is given as (1). The LE method states that the optimized delay of a path of $N$ cascaded stages is

$$D = N F^{1/N} + P \quad (2)$$

where $G$, $B$, and $H$ ($= C_{L}/C_{in}$) are the logical effort, branching effort, and electrical effort, while $P$, $F$ ($= GBH$), and $C_{L}$ are the parasitic delay, path effort, and final load capacitance, respectively. From (2) and (4), expression (5) is obtained, which contains the relative delay increment with respect to the parasitic delay. Equations (4) and (5) indicate that larger values of $C_{in}$ lead to a saturation in the optimized delay and, based on the above analysis, the delay sensitivity factor introduced by Alioto et al. [25] is utilized to obtain the upper bound on the transistor widths for exploration of the power-delay design space with the least computational effort. The delay sensitivity factor is obtained from (3) to (5). The upper bounds on the normalized transistor widths (normalized with respect to the minimum width) have been obtained such that the delay sensitivity remains under a minimum value, which is chosen as −5% for our analysis. The input capacitance of the flip-flop is then expressed in terms of the normalized width.
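The saturation argument can be illustrated with the standard logical effort path delay $D = N F^{1/N} + P$, $F = GBH$, $H = C_{L}/C_{in}$ from [24]. The sketch below is not the authors' optimization flow: the stage count, logical effort, branching effort, and parasitic delay are hypothetical values, and only the 19.92 fF load is taken from the text.

```python
def path_delay(n_stages, g, b, c_load, c_in, p):
    """Unitless logical-effort delay of a path with equal stage efforts."""
    h = c_load / c_in          # electrical effort H
    f = g * b * h              # path effort F = G * B * H
    return n_stages * f ** (1.0 / n_stages) + p


# Hypothetical path parameters (assumptions for illustration only).
N, G, B, P = 4, 2.0, 2.0, 8.0
C_LOAD_FF = 19.92              # 16X load quoted in the text

# Doubling the input capacitance repeatedly buys less and less delay.
c_in_sweep = [2.5, 5.0, 10.0, 20.0]
delays = [path_delay(N, G, B, C_LOAD_FF, c, P) for c in c_in_sweep]
gains = [d0 - d1 for d0, d1 in zip(delays, delays[1:])]
```

Each doubling of the input capacitance reduces the delay by a smaller amount than the previous one, and the delay never drops below the parasitic term $P$, which is why an upper bound on transistor widths is a reasonable way to limit the design space.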
Figure 12 shows the conventional TGFF design. The sizing is done by assuming the transistors in the critical path to be independent design variables (IDVs) and optimizing for maximum performance using LE theory. The inverter before the transmission gate in the first stage protects the input terminal from noise variations [31]. Table 2 exhibits the delay variation for increasing $C_{in}$ values. It is noteworthy that the delay saturates at 153 ps for $C_{in}$ = 24.8 fF. As a result, the upper bounds on transistor widths are exposed and the limits of the power (energy)-delay design space are defined early in the design cycle [32]. The table also includes the corresponding power dissipation along with the power-delay product, and it is observed that the minimum power-delay product is obtained at $C_{in}$ = 9.92 fF. The technology parameters used for capacitance calculations throughout this paper are listed in Table 3.


4. Results and Discussion
It is a well-established fact that the conventional C^{2}MOS flip-flop, although slower, is skew tolerant and occupies less area than TGFF [23, 33]. Moreover, mC^{2}MOSff1 and mC^{2}MOSff2 show nearly identical characteristics in terms of power, delay, and area, and hence only mC^{2}MOSff1 is considered for comparisons.
The waveforms in Figure 13 represent the transient analysis of mC^{2}MOSff1 carried out over a period of 8 clock cycles. The SPICE simulation results verify the correct flip-flop operation at 1 GHz clock frequency (all the flip-flops reported in the paper are designed for negative-edge-triggered operation). The variation of absolute data-to-output delays with FF input capacitance ($C_{in}$) for 16X (19.92 fF) capacitive load is illustrated in Figure 14.
TGFF utilizes transmission gates in the critical path and hence it is faster than the rival designs. There is exactly the same number of stages in the critical path of TGFF and mC^{2}MOSff1, the only difference being that the latching circuit in case of TGFF is an inverter followed by a clocked transmission gate (inverting latch), whereas a clocked/tristate inverter is present in mC^{2}MOSff1. Logical effort of both the latches is considered to be two; however, it is apparent that an inverter followed by a transmission gate is faster because the output node is driven by both the transistors of the transmission gate in parallel and this behaviour is reflected in Figure 14. From the above discussion, it is obvious that the value of logical effort for an inverting latch can be assumed to be two for most theoretical purposes, but for comparison with a C^{2}MOS latch, it must be slightly less than two if delays are to be modelled precisely.
Equation (2) clearly indicates that a lower branching effort leads to faster circuit operation. The branching effort for a path with internal fanout is expressed as [24]

$$b = \frac{C_{on\text{-}path} + C_{off\text{-}path}}{C_{on\text{-}path}}$$

where $C_{on\text{-}path}$ represents the load capacitance along the path under analysis and $C_{off\text{-}path}$ represents the capacitance of the connections that lead off the path.
The branching effort along the critical path is given as

$$B = \prod_{i} b_{i}$$
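The branching effort definitions from [24] translate directly into a few lines of code. The sketch below uses hypothetical capacitance values purely to show the calculation; they are not the values of the TGFF or mC^{2}MOSff1 branches.

```python
from math import prod


def branching_effort(c_on_path, c_off_path):
    """b = (C_on-path + C_off-path) / C_on-path for a single branch point."""
    return (c_on_path + c_off_path) / c_on_path


def path_branching_effort(branches):
    """B = product of the individual branching efforts along the path."""
    return prod(branching_effort(on, off) for on, off in branches)


# Two hypothetical branch points, capacitances in fF: (on-path, off-path).
example_branches = [(3.0, 1.0), (2.0, 2.0)]
B = path_branching_effort(example_branches)
```

A branch with no off-path load has $b = 1$ and does not affect $B$, so trimming the capacitance hanging off the critical path is what lowers the path effort.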
There are two branches each in TGFF and mC^{2}MOSff1 represented as , and , in Figures 6 and 5(a), respectively. The branching effort corresponding to branches , , , and is calculated as follows.
4.1. Branching Effort in Case of TGFF
One has the following. Calculation: . Calculation: , .
4.2. Branching Effort in Case of mC^{2}MOSff1
One has the following. Calculation: . Calculation: , ,where is gate to drain capacitance, is drain to body capacitance, and is the gate capacitance of respective transistors.
Accordingly, using (2) with the respective values of $N$, $G$, $B$, $H$, and $P$, we have $D = 12.7$ (absolute delay 165.1 ps) for TGFF and $D = 12.79$ (absolute delay 166.27 ps) for mC^{2}MOSff1. Absolute delays are obtained by multiplying the unitless delay $D$ by the delay unit $\tau$:

$$D_{abs} = D \cdot \tau$$

It is clearly observed that the delay of mC^{2}MOSff1 is marginally higher than the delay of TGFF. Now, keeping the other parameters the same and assuming the logical effort of the inverting latch to be 1.8, the updated value for TGFF is evaluated as $D = 12.35$ (absolute delay 160.55 ps).
The value of the process-dependent parameter $\tau$ is determined as approximately 13 ps using the calibration technique described by Sutherland et al. [24]. The detailed procedure is discussed in the Appendix. The absolute delay measurements obtained through simulation are 162 ps for TGFF and 196 ps for mC^{2}MOSff1, which is in close agreement with the theoretical values of 160.55 ps and 166.27 ps, respectively (typically within 15% error).
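The relation between absolute and unitless delays can be checked numerically. The sketch below assumes the delay unit is exactly 13 ps (the text states "approximately 13 ps") and converts the theoretical absolute delays quoted above back into unitless logical-effort delays; the function and variable names are hypothetical.

```python
TAU_PS = 13.0  # delay unit from the text, assumed here to be exactly 13 ps


def to_unit_delay(abs_delay_ps, tau_ps=TAU_PS):
    """Convert an absolute delay in ps to a unitless logical-effort delay."""
    return abs_delay_ps / tau_ps


def to_absolute_delay(unit_delay, tau_ps=TAU_PS):
    """Convert a unitless logical-effort delay to an absolute delay in ps."""
    return unit_delay * tau_ps


# Theoretical absolute delays quoted in the text (ps).
theoretical_ps = {
    "TGFF (g = 2)": 165.1,
    "mC2MOSff1": 166.27,
    "TGFF (g = 1.8)": 160.55,
}
unit_delays = {name: round(to_unit_delay(v), 2) for name, v in theoretical_ps.items()}
```

Under this assumption the three quoted delays correspond to unitless delays of 12.7, 12.79, and 12.35, respectively.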
WPMS and PTLFF topologies show degraded performance due to the presence of pass transistors in the critical path while the speed of clockgated structures is worst mainly because gating circuit is inserted between the clock and the flipflop terminals which deteriorates the timing characteristics. The characterizations are done assuming that and (16X) where represents the flipflop load capacitance.
The variation of average power with $C_{in}$ for the 16X loading condition is depicted in Figure 15. Due to the threshold voltage drop at internal nodes, WPMS and PTLFF display the worst power dissipation characteristics because of short-circuit power dissipation. GMSL and DTLA exhibit greater power dissipation than their non-gated counterparts because the pseudorandom sequence has an activity factor of 0.5. The reason is the presence of the additional comparator and clock gating circuit, which is beneficial only at sufficiently low switching activities and otherwise leads to both increased area and power overhead.
4.3. Clock Load Calculations
One has the following.
TGFF: {Transistors contributing towards clock load in the critical path} + {Transistors contributing towards clock load in the feedback structure} = 14.78 fF + 1.66 fF = 16.44 fF.
mC^{2}MOSff1: {Transistors contributing towards clock load in the critical path} + {Transistors contributing towards clock load in the feedback structure} = 22.18 fF + 0.84 fF = 23.02 fF.
Apart from the clock load, the capacitance value at internal nodes of mC^{2}MOSff1 is reduced as compared to TGFF by eliminating transistors TN6 and TP6 from the feedback structure.
4.4. Capacitance Calculations at Internal Nodes of TGFF
Internal Capacitance at Nodes P and K
Node P: (TN2) + (TP2) + (TN5) + (TN5) + (TP5) + (TP5) = 9.28 fF.
Node K: (TN6) + (TP6) + (TN9) + (TP9) = 9.02 fF.
Internal Capacitance at Nodes M and N
Node M: (TN1) + (TN1) + (TP1) + (TP1) + (TN3) + (TN3) + (TP3) + (TP3) + (TN4) + (TP4) = 18.41 fF.
Node N: (TN5) + (TN5) + (TP5) + (TP5) + (TN7) + (TN7) + (TP7) + (TP7) + (TN8) + (TP8) = 14.80 fF.
4.5. Capacitance Calculations at Internal Nodes of mC^{2}MOSff1
Internal Capacitance at Nodes P’ and K’
Node P’: (TN12) + (TP12) = 9.76 fF.
Node K’: (TN13) + (TP13) + (TN14) + (TP14) + (TN16) + (TP16) + (TN16) + (TP16) = 10.06 fF.
Internal Capacitance at Node M’
Node M’: (TN15) + (TP15) = 12.35 fF.
It can be easily concluded from calculations above that a total of 19.34 fF capacitance has been reduced from the internal nodes in the critical path of mC^{2}MOSff1 in comparison to TGFF. This leads to reduced internal power dissipation at these nodes as lesser capacitance has to be charged or discharged per clock cycle. However, reduction in the clock load of mC^{2}MOSff1 due to transistors eliminated from the feedback structure is nullified due to PMOS transistors TP10 and TP11 whose size is twice that of transistors TP1 and TP5 in case of TGFF and as a result the total power dissipation of both the flipflops is nearly the same as it can be clearly observed from Figure 16. Following a similar procedure, the clock load of various flipflops is obtained and listed in Table 4 along with number of clocked transistors and power consumption values. It is seen that TGFF and mC^{2}MOSff1 represent the most efficient designs in terms of reduced power consumption having power dissipation comparable to DTLA at and .
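The quoted reduction in internal node capacitance can be reproduced from the per-node totals listed in Sections 4.4 and 4.5. The short check below only sums the figures reported in the text; the dictionary names are hypothetical.

```python
# Per-node capacitance totals reported in the text (fF).
tgff_nodes_fF = {"P": 9.28, "K": 9.02, "M": 18.41, "N": 14.80}
mc2mosff1_nodes_fF = {"P'": 9.76, "K'": 10.06, "M'": 12.35}

tgff_total = sum(tgff_nodes_fF.values())
mc2mosff1_total = sum(mc2mosff1_nodes_fF.values())
reduction_fF = tgff_total - mc2mosff1_total
```

The TGFF nodes add up to 51.51 fF and the mC^{2}MOSff1 nodes to 32.17 fF, which gives the 19.34 fF difference stated above.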
 
A pseudorandom sequence with $\alpha = 0.5$ is used for power calculations.
It can be observed that mC^{2}MOSff1 has the least transistor count along with PTLFF, while GMSL and DTLA consist of the maximum number of transistors. Since only sixteen transistors are used for circuit realization of mC^{2}MOSff1, its power dissipation is comparable to that of TGFF. It is worth noting that GMSL and DTLA offer the minimum clock load; as a result, these topologies exhibit the least power dissipation at lower switching activities. The reason for the extended clock-to-output delays of GMSL and DTLA is the insertion of clock gating circuitry, while DTLA has a pulsed operation and hence shows negative setup time requirements. Based on the power and delay measurements, power-delay product characteristics are derived for all the flip-flops as shown in Figure 16. The optimum power-delay product of the gated structures GMSL and DTLA is, respectively, 3.30x and 3.34x greater than the optimum PDP of TGFF. Among the non-clock-gated structures, the pass transistor based designs WPMS and PTLFF exhibit a 1.77x and 1.57x increase in the power-delay product with respect to the benchmark flip-flop TGFF. TGFF also shows a 20% improvement over mC^{2}MOSff1 in terms of minimum power-delay product. However, despite the fact that TGFF represents a better alternative in terms of performance and optimum power-delay product, the area requirements also remain a major concern. It has been observed in the literature that the conventional C^{2}MOS based flip-flop is up to 20–25% more efficient in terms of occupied chip area. This stems mainly from the fact that at layout level (i) in comparison to TGFF, diffusion areas of most of the transistors can be shared in the C^{2}MOS flip-flop [33], (ii) the number of contact holes can be reduced in the layout pattern [23], and (iii) a less complicated feedback structure leads to fewer interconnections.
The layouts were implemented using , indicating almost similar transistor sizes throughout the critical path with the exception of TP10 and TP11 belonging to mC^{2}MOSff1 which are twice in size compared to TP1 and TP5 in accordance with the LE theory. The layouts for TGFF and mC^{2}MOSff1 are shown in Figures 17 and 18, respectively. Table 5 clearly shows that while TGFF is better in terms of PDP by 18.4%, mC^{2}MOSff1 shows a 12.4% improvement in the PDAP making it suitable for high density applications where performance can be compromised.
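The two figures of merit can be written out explicitly to show how a design that loses on PDP can still win on PDAP once area is included. The numbers below are hypothetical and are not the Table 5 values; the function names and units are assumptions made for the illustration.

```python
def pdp_fJ(power_uW, delay_ps):
    """Power-delay product in fJ (1 uW x 1 ps = 1e-3 fJ)."""
    return power_uW * delay_ps * 1e-3


def pdap(power_uW, delay_ps, area_um2):
    """Power-delay-area product in fJ * um^2."""
    return pdp_fJ(power_uW, delay_ps) * area_um2


def improvement_percent(reference, candidate):
    """Positive when the candidate has the lower (better) figure of merit."""
    return 100.0 * (reference - candidate) / reference


# Hypothetical designs: B is slower than A but occupies less area.
design_a = {"power_uW": 20.0, "delay_ps": 160.0, "area_um2": 100.0}
design_b = {"power_uW": 20.0, "delay_ps": 190.0, "area_um2": 75.0}

pdp_gain = improvement_percent(
    pdp_fJ(design_a["power_uW"], design_a["delay_ps"]),
    pdp_fJ(design_b["power_uW"], design_b["delay_ps"]),
)
pdap_gain = improvement_percent(pdap(**design_a), pdap(**design_b))
```

With these made-up numbers, design B is about 19% worse in PDP yet about 11% better in PDAP, which mirrors the kind of trade-off reported for TGFF and mC^{2}MOSff1.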

The power dissipation results illustrated in Figure 19 are obtained with a transistor sizing which ensures that all the transistors in the critical path have similar widths. At zero switching activity, clock-gated topologies are the most power efficient. GMSL and DTLA show 32.5% and 46.3% reduction in power, respectively, in the case of logic high at the input, whereas for logic low, the power consumption is reduced by 19.2% and 35.4%, respectively. Again, it can be clearly observed that there is only a slight difference in the power dissipation of TGFF and mC^{2}MOSff1 at different switching activities.
The correct functionality of the proposed flip-flop mC^{2}MOSff1 is validated by designing an 8-bit ripple counter at 16X capacitive load, and the average power measurements were carried out over 256 clock cycles. It was noticed that the power consumption of the mC^{2}MOSff1 based counter is comparable to that of the TGFF based counter at varying frequencies. Again, LE theory has been adopted for sizing the individual flip-flops in each counter for optimum performance, as described in detail in the Appendix.
The flipflops were also designed and simulated to layout level with inclusion of parasitics at 130 nm, 90 nm, and 65 nm CMOS processes to address scalability issues at more advanced process nodes. The simulation test bench and optimization methodology are similar as mentioned in Section 3. PVT variations are emphasized to evaluate the performance of flipflops at all process corners, namely, FF, SS, FS, and SF with voltages scaled from 0.9 to 1.1 V while the temperatures varied from 0 to 125 degrees as shown in Table 6. The simulation and technology parameters are also listed in Table 6 where represents the capacitance per unit gate oxide and was evaluated to be 1.3 fF/um by fitting simulation data. In addition, the capacitances per unit length of poly, metal 1 and metal 2 interconnects are also mentioned.
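A PVT sweep of the kind described above is typically organized as a grid of corner, supply voltage, and temperature combinations. The sketch below enumerates such a grid; the four corners and the voltage and temperature ranges come from the text, while the 1.0 V midpoint, the use of only the two temperature extremes, and the Celsius unit are assumptions.

```python
from itertools import product

CORNERS = ["FF", "SS", "FS", "SF"]   # process corners named in the text
SUPPLY_V = [0.9, 1.0, 1.1]           # range from the text; 1.0 V midpoint assumed
TEMPERATURE_C = [0, 125]             # range endpoints; Celsius assumed

# Every (corner, voltage, temperature) combination to be simulated.
pvt_grid = list(product(CORNERS, SUPPLY_V, TEMPERATURE_C))

for corner, vdd, temp in pvt_grid:
    # A real flow would launch one post-layout simulation per entry here.
    label = f"{corner}_{vdd:.1f}V_{temp}C"
```

This grid contains 24 simulation points; a finer voltage or temperature step simply extends the two lists.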

For illustration purposes, the delay and power variations with the flipflop input capacitance with respect to different process corners at 65 nm CMOS technology for mC^{2}MOSff1 are demonstrated in Figures 20 and 21, respectively, at 16X capacitive loading. Both mC^{2}MOSff1 and mC^{2}MOSff2 showed correct circuital behaviour at the aforementioned process nodes which indicates that no internal noise violations exist especially due to the fact that logic levels are retained even at FF process corner. However, it is to be pointed out that mC^{2}MOSff1 in a manner similar to TGFF starts to fail at SS corner for lower values of [34].
5. Conclusion
In this paper, an alternative architecture for designing C^{2}MOS based flip-flops is presented with a modified feedback strategy while preserving the fully static operation. Using the new feedback approach, a modified topology mC^{2}MOSff1 is proposed with decreased parasitic capacitances at internal nodes in comparison to the TGFF, which is the best design in terms of PDP. However, post-layout simulations and analyses indicate that the modified configuration mC^{2}MOSff1 presents the best alternative in terms of PDAP among all the conventional designs. Therefore, for high performance applications, TGFF still remains the best choice, but it can be replaced by mC^{2}MOSff1 for high density applications. Comparisons were carried out with state-of-the-art flip-flops in the master-slave class. The simulation results are well supported by mathematical analysis based on logical effort theory within acceptable error (typically less than 15%).
Appendices
A. Delay Calibration Using LE Theory
For modelling delays using LE theory, all the delays are initially expressed in terms of a basic delay unit $\tau$ which is process dependent, such that the absolute delay is represented as the product of the unitless delay of the gate, as shown in (2), and the delay unit $\tau$. Accordingly,

$$d_{abs} = d \cdot \tau$$

While $D$ represents the delay for a multistage path, $d$ corresponds to the delay of a single-stage logic gate. Parameter $\tau$ needs to be estimated in order to obtain absolute delays, and accordingly a delay versus fanout curve is determined for an inverter as shown in Figure 22 by fitting simulation data. The curve is approximated as a straight line and the slope of the line represents $\tau$, since $d = gh + p$ and the logical effort $g$ of an inverter is 1. In our case, $\tau$ is estimated as 13 ps.
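The calibration step amounts to a straight-line fit of inverter delay against fanout, with the slope read off as the delay unit. The sketch below uses an ordinary least-squares fit on synthetic data generated from $d_{abs} = \tau (h + p_{inv})$ with an assumed $\tau$ of 13 ps and an assumed inverter parasitic delay of 1; it does not use the simulation data behind Figure 22.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Synthetic inverter delays (ps) versus fanout h; not measured data.
ASSUMED_TAU_PS = 13.0
ASSUMED_P_INV = 1.0
fanouts = [1, 2, 3, 4, 5, 6]
delays_ps = [ASSUMED_TAU_PS * (h + ASSUMED_P_INV) for h in fanouts]

tau_ps, intercept_ps = fit_line(fanouts, delays_ps)
p_inv = intercept_ps / tau_ps  # parasitic delay in units of tau
```

With real simulation data the points would not lie exactly on a line, and the same fit would then return the best straight-line estimate of $\tau$.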
B. Implementation of 8Bit Ripple Counter
An 8-bit asynchronous counter was implemented by converting the D flip-flop configuration to a T flip-flop configuration using an EXOR gate as illustrated in Figure 23.
The T flip-flop designed using TGFF is shown in Figure 24. It is considered to be a five-stage design and is optimized for highest speed using LE theory. The EXOR gate was realized using transmission gates as shown in Stage 1 of Figure 24. A similar procedure was followed for designing the mC^{2}MOSff1 based T flip-flop.
For designing the modulo-256 counter, the output of each stage is connected to the clock terminal of the next stage through two intermediate inverters (acting as a buffer) sized ($W_{p}$ = 11.52 µm, $W_{n}$ = 5.76 µm) such that the input capacitance of the first inverter acts as the load capacitance for the flip-flop configuration of the previous stage, as depicted in Figure 25. As a result, the load at the output terminal of each flip-flop is uniformly fixed at 19.92 fF.
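The counter organization can be sketched behaviourally: each D flip-flop becomes a T flip-flop by feeding it D = T XOR Q, and each output clocks the next stage. The model below is an idealized, zero-delay illustration with hypothetical class names; it assumes negative-edge-triggered stages, a T input tied to logic high, and a non-inverting buffer between stages.

```python
class TFlipFlop:
    """Negative-edge-triggered T flip-flop built as a D flip-flop with D = T XOR Q."""

    def __init__(self):
        self.q = 0
        self.prev_clk = 0

    def tick(self, clk, t=1):
        if self.prev_clk == 1 and clk == 0:   # falling clock edge
            self.q = t ^ self.q               # EXOR-based D-to-T conversion
        self.prev_clk = clk
        return self.q


class RippleCounter:
    """Asynchronous counter: each stage output clocks the next stage."""

    def __init__(self, n_bits=8):
        self.stages = [TFlipFlop() for _ in range(n_bits)]

    def apply_clock(self, clk):
        signal = clk
        for stage in self.stages:
            signal = stage.tick(signal)       # buffered Q drives the next clock
        return self.value()

    def value(self):
        return sum(stage.q << i for i, stage in enumerate(self.stages))


def run_cycles(counter, n_cycles):
    """Apply n full clock cycles (high phase, then low phase)."""
    for _ in range(n_cycles):
        counter.apply_clock(1)
        counter.apply_clock(0)
    return counter.value()
```

For example, `run_cycles(RippleCounter(), 5)` returns 5, and after 256 cycles an 8-bit counter wraps back to 0.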
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
References
[1] H. Kawaguchi and T. Sakurai, “A reduced clock-swing flip-flop (RCSFF) for 63% power reduction,” IEEE Journal of Solid-State Circuits, vol. 33, no. 5, pp. 807–811, 1998.
[2] G. Yeap, Practical Low Power Digital VLSI Design, Kluwer Academic, 1998.
[3] V. Oklobdzija, V. Stojanovic, D. Markovic, and N. Nedovic, Digital System Clocking: High-Performance and Low-Power Aspects, Wiley-IEEE Press, 2003.
[4] B. Mesgarzadeh, M. Hansson, and A. Alvandpour, “Jitter characteristic in charge recovery resonant clock distribution,” IEEE Journal of Solid-State Circuits, vol. 42, no. 7, pp. 1618–1625, 2007.
[5] C. Giacomotto, N. Nedovic, and V. G. Oklobdzija, “The effect of the system specification on the optimal selection of clocked storage elements,” IEEE Journal of Solid-State Circuits, vol. 42, no. 6, pp. 1392–1404, 2007.
[6] G. Gerosa, S. Gary, C. Dietz et al., “2.2 W, 80 MHz superscalar RISC microprocessor,” IEEE Journal of Solid-State Circuits, vol. 29, no. 12, pp. 1440–1454, 1994.
[7] D. Markovic, J. Tschanz, and V. De, “Transmission-gate based flip-flop,” US Patent 6642765, 2003.
[8] S. K. Hsu, S. K. Mathew, M. A. Anders et al., “A 110 GOPS/W 16-bit multiplier and reconfigurable PLA loop in 90-nm CMOS,” IEEE Journal of Solid-State Circuits, vol. 41, no. 1, pp. 256–264, 2006.
[9] R. Hossain, L. D. Wronski, and A. Albicki, “Low power design using double edge triggered flip-flops,” IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 2, no. 2, pp. 261–264, 1994.
[10] A. G. M. Strollo, E. Napoli, and D. de Caro, “Low-power flip-flops with reliable clock gating,” Microelectronics Journal, vol. 32, no. 1, pp. 21–28, 2001.
[11] M. Nogawa and Y. Ohtomo, “A data-transition look-ahead DFF circuit for statistical reduction in power consumption,” IEEE Journal of Solid-State Circuits, vol. 33, no. 5, pp. 702–706, 1998.
[12] F. Klass, C. Amir, A. Das et al., “A new family of semidynamic and dynamic flip-flops with embedded logic for high-performance processors,” IEEE Journal of Solid-State Circuits, vol. 34, no. 5, pp. 712–716, 1999.
[13] P. Zhao, T. Darwish, and M. Bayoumi, “Low power and high speed explicit-pulsed flip-flops,” in Proceedings of the 45th Midwest Symposium on Circuits and Systems, pp. II-477–II-480, August 2002.
[14] H. Partovi, R. Burd, U. Salim, F. Weber, L. DiGregorio, and D. Draper, “Flow-through latch and edge-triggered flip-flop hybrid elements,” in Proceedings of the IEEE International Solid-State Circuits Conference, pp. 138–139, February 1996.
[15] R. Heald, K. Aingaran, C. Amir et al., “Third-generation SPARC V9 64-b microprocessor,” IEEE Journal of Solid-State Circuits, vol. 35, no. 11, pp. 1526–1538, 2000.
[16] N. Nedovic, M. Aleksic, and V. G. Oklobdzija, “Conditional techniques for low power consumption flip-flops,” in Proceedings of the 8th IEEE International Conference on Electronics, Circuits and Systems (ICECS '01), pp. 803–806, September 2001.
[17] S. D. Naffziger, G. Colon-Bonet, T. Fischer, R. Riedlinger, T. J. Sullivan, and T. Grutkowski, “The implementation of the Itanium 2 microprocessor,” IEEE Journal of Solid-State Circuits, vol. 37, no. 11, pp. 1448–1460, 2002.
[18] B.-S. Kong, S.-S. Kim, and Y.-H. Jun, “Conditional-capture flip-flop for statistical power reduction,” IEEE Journal of Solid-State Circuits, vol. 36, no. 8, pp. 1263–1271, 2001.
[19] S. Shin and B. Kong, “Variable sampling window flip-flops for low-power high-speed VLSI,” IEE Proceedings of Circuits, Devices and Systems, vol. 152, no. 3, pp. 266–271, 2005.
[20] B. Nikolić, V. G. Oklobdžija, V. Stojanović, W. Jia, J. K.-S. Chiu, and M. M.-T. Leung, “Improved sense-amplifier-based flip-flop: design and measurements,” IEEE Journal of Solid-State Circuits, vol. 35, no. 6, pp. 876–884, 2000.
[21] N. Nedovic, V. G. Oklobdzija, and W. W. Walker, “A clock skew absorbing flip-flop,” in Proceedings of the IEEE International Solid-State Circuits Conference, vol. 1, pp. 342–344, February 2003.
[22] A. G. M. Strollo and D. de Caro, “Low power flip-flop with clock gating on master and slave latches,” Electronics Letters, vol. 36, no. 4, pp. 294–295, 2000.
[23] Y. Suzuki, K. Odagawa, and T. Abe, “Clocked CMOS calculator circuitry,” IEEE Journal of Solid-State Circuits, vol. SC-8, no. 6, pp. 462–469, 1973.
[24] I. Sutherland, B. Sproull, and D. Harris, Logical Effort: Designing Fast CMOS Circuits, Morgan Kaufmann, Los Altos, Calif, USA, 1998.
[25] M. Alioto, E. Consoli, and G. Palumbo, “General strategies to design nanometer flip-flops in the energy-delay space,” IEEE Transactions on Circuits and Systems I, vol. 57, no. 7, pp. 1583–1596, 2010.
[26] V. Stojanovic and V. G. Oklobdzija, “Comparative analysis of master-slave latches and flip-flops for high-performance and low-power systems,” IEEE Journal of Solid-State Circuits, vol. 34, no. 4, pp. 536–548, 1999.
[27] S. Heo and K. Asanovic, “Load-sensitive flip-flop characterization,” in Proceedings of the IEEE Computer Society Workshop on VLSI, pp. 87–92, 2001.
[28] M. Alioto, E. Consoli, and G. Palumbo, “Analysis and comparison in the energy-delay-area domain of nanometer CMOS flip-flops. Part I: methodology and design strategies,” IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 19, no. 5, pp. 725–736, 2011.
[29] M. Alioto, E. Consoli, and G. Palumbo, “Analysis and comparison in the energy-delay-area domain of nanometer CMOS flip-flops. Part II: results and figures of merit,” IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 19, no. 5, pp. 737–750, 2011.
[30] G. Palumbo and M. Pennisi, “Design guidelines for high-speed transmission-gate latches: analysis and comparison,” in Proceedings of the 15th IEEE International Conference on Electronics, Circuits and Systems (ICECS '08), pp. 145–148, September 2008.
[31] E. Consoli, G. Palumbo, and M. Pennisi, “Reconsidering high-speed design criteria for transmission-gate-based master-slave flip-flops,” IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 20, no. 2, pp. 284–295, 2012.
[32] M. Alioto, E. Consoli, and G. Palumbo, “From energy-delay metrics to constraints on the design of digital circuits,” International Journal of Circuit Theory and Applications, vol. 40, pp. 815–834, 2012.
[33] H. J. Chao and C. A. Johnston, “Behavior analysis of CMOS D flip-flops,” IEEE Journal of Solid-State Circuits, vol. 24, no. 5, pp. 1454–1458, 1989.
[34] H. Q. Dao, K. Nowka, and V. G. Oklobdzija, “Analysis of clocked timing elements for dynamic voltage scaling effects over process parameter variation,” in Proceedings of the International Symposium on Low Power Electronics and Design (ISLPED '01), pp. 56–59, August 2001.
Copyright
Copyright © 2014 Kunwar Singh et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.