Abstract
Device mismatch and process variation models play a key role in determining the functionality and yield of sub-100 nm designs. Average characteristics are often of interest, such as the average leakage current or the average read delay. However, detecting rare functional fails is critical for memory design, and designers often seek techniques that enable accurate modeling of such events. Extremely leaky devices can inflict functionality fails. The plurality of leaky devices on a bitline increases the dimensionality of the yield estimation problem. Simplified models are possible by adopting approximations to the underlying sum of lognormals. However, the implications of such approximations on tail probabilities may in turn bias the yield estimate. We review different closed-form approximations and compare them against the CDF matching method, which is shown to be the most effective method for accurate statistical leakage modeling.
1. Introduction
With technology scaling, memory designs are the first to suffer from process variation. Density requirements in memory cells, latches, and register files aggravate variability effects, and these designs often undergo performance and yield degradation. More recently, the trend has become visible in peripheral logic as well [1], where random variations can lead to reduced noise margins. From a leakage perspective, the general interest has been to model the statistical leakage distribution of the full chip [2]. However, under extreme device conditions, leakages may inflict delays or faulty behavior, for example, false reads. In memory designs, bitline leakages can further aggravate the variability effects and impact the readability of the design. Often many devices are stacked on the bitlines, and it has become necessary to statistically account for and model those many independent and, in most cases, identical leaky sources. A typical array column can have around 16 to 64 cells per bitline. With the aim of capturing rare fail events accurately and modeling the variability space [3], it becomes important to emulate those independent variability sources by a single statistically equivalent device, the main goal being to reduce the dimensionality of the problem. This is especially important for response surface modeling-based integration methods and for variance reduction techniques.
The threshold voltage variation due to random dopant fluctuation [4] for the single device, however, is different from that of the equivalent device. In a typical linear model, the equivalent statistical model would be derived from linear relationships with respect to the individual devices, following what designers often refer to as the square root law: as the device area $A$ increases, the threshold voltage variation scales as $1/\sqrt{A}$. This would let us model $N$ small devices with one large device whose variation is $1/\sqrt{N}$ times the original variation. However, as was indicated in [5, 6], this model lacks accuracy when modeling the highly nonlinear leakage currents, where the dependency on the threshold voltage (the source of variation) is exponential. In fact, as we will explain in the following sections, the leakage current is distributed lognormally, and the problem of the equivalent device is known mathematically as the sum of lognormals. A true and exact closed form does not exist for this sum, and historically there have been many approximations. There is a large body of literature on the sum of lognormals, and applications span a wide range of domains including economics, finance, actuarial sciences, and others, with engineering being one of the oldest. The most famous approximations to the sum of lognormals are the Fenton-Wilkinson (FW) [7] and Schwartz-Yeh (SY) [8] methods. For circuit design, those models have been studied in [5, 6] to enable modeling the threshold voltage distribution of the equivalent device. The authors in [5] relied on the FW method to maintain a reasonable 3-sigma estimate (99th percentile). It is often indicated that when the standard deviation exceeds 4 dB, the FW method may not result in a proper approximation of the lognormal distribution. On the other hand, the authors in [6] proposed to use the SY methodology to model the equivalent threshold voltage distribution of a 3-fin device. Again, the approximation was qualified only for a small number of summands and based on a thousand samples, thereby ignoring the accuracy of the model in the tail regions, which is critical to the estimation of low fail probabilities. In the following sections, we review the modeling trends in the critical regions, focusing on the cumulative distribution function (CDF) matching method [9] as a viable solution for reliable statistical modeling when tail region modeling is key to the accuracy of the yield estimate.
The paper is organized as follows. Section 2 introduces a basic circuit example to describe the need for single equivalent device modeling. Section 3 reviews the mathematical background and assumptions of the most common sum of lognormals approximation methods. It also provides a summary chart of the different approaches under study. Section 4 evaluates the impact on fail probability estimations. Finally, conclusions are presented in Section 5.
2. Equivalent Device Modeling
Figure 1 illustrates an 8T cell/register file. The circuit operates like a 6T SRAM cell during cell Write. During Read, the read wordline turns ON. If the cell is storing a "1" on the right-side node, in this example, the Read stack will turn ON, thereby discharging the read bitline (RBL).
Driven by density requirements, many cells share the same bitline and sense amplifier (or read logic). Only one cell per column is accessed at a given time; other cells remain unaccessed. For the example of Figure 1, the RWL of the unaccessed cells is turned OFF. However, based on the data storage, it is possible that the data stack device turns on, thereby rendering the stack leaky. Figure 2 illustrates a scenario where the accessed cell is OFF. RBL is expected to remain high. However, leakages through the accessed stack and unaccessed stacks may falsely discharge the RBL node. This together with noise, mismatch in the read logic, and other environmental effects can degrade the yield of the system and increase the probability of a false read.
Note that this phenomenon is also manifested in other memory and domino logic designs. For example, in the case of the 6T SRAM, the devices sharing the WBL (see Figure 1) can be storing 1's, thereby leaking onto the WBL and degrading the Read "0" performance in a 6T fashion. Hence there is a need to account for the statistical effects of the leaky devices in any design yield optimization and study. This can dramatically increase the dimensionality of the problem, thereby reducing the efficiency of statistical integration, response surface methods, or any method that tries to model the failure region and that suffers from sparsity and the curse of dimensionality. Consider, for example, a study that involves variability in 64 devices on the bitline along with variability in the sense amp devices. A model for a statistically equivalent device that emulates the multiple leaky devices can therefore significantly reduce the complexity. Figure 3 summarizes the problem. Given: $N$ leaky devices, where the threshold voltage variation $\delta V_{T,i}$ of each device is normally distributed with zero mean and standard deviation $\sigma_0$, that is, $\delta V_{T,i} \sim N(0, \sigma_0^2)$. Find: the distribution of the threshold voltage variation $\delta V_{T,\mathrm{eqv}}$ of the equivalent device such that its leakage current is distributed according to the sum of the $N$ individual leakage currents.
Note that we assume here independent and identical devices; the problem can be generalized to nonidentical/correlated devices.
Leakage Current Model
Leakage current as a function of the threshold voltage variation can be modeled according to (1). Hence it portrays an exponential dependency on the random variable $\delta V_T$; $q$ is the electron charge, $k$ is the Boltzmann constant, and $T$ is the temperature. For simplicity, the equivalent device has the same width as the other devices, but it is followed by a current multiplier to model the equivalent leaky device and solve for the new distribution: the set of equations (2) represents how the problem can be reduced. The threshold voltage variations are normal variables, the resulting individual leakage terms are lognormal variables (see the next section for definitions), and the total leakage (and hence the leakage of the equivalent device) is distributed according to a sum of lognormals. A sum of lognormals is not distributed lognormally, and hence the threshold voltage variation of the equivalent device is not normal. In the following section we review the characteristics of the sum of lognormals and the possible approximation methods that can be used to generate the statistical characteristics of the sum and hence of the equivalent threshold voltage variation.
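Equations (1) and (2) are not reproduced in this text. As a sketch under stated assumptions, a subthreshold leakage model consistent with the description above could be written as follows. The prefactor $I_0$, the slope factor $\eta$, the device count $N$, the use of $N$ as the current multiplier, and the subscript notation are all assumptions introduced here for illustration; they are not taken from the original equations.

```latex
\begin{aligned}
I_i &= I_0 \exp\!\left(-\frac{q\,\delta V_{T,i}}{\eta k T}\right),
\qquad \delta V_{T,i} \sim \mathcal{N}(0,\sigma_0^2), \quad i = 1,\dots,N, \\
N\, I_0 \exp\!\left(-\frac{q\,\delta V_{T,\mathrm{eqv}}}{\eta k T}\right) &= \sum_{i=1}^{N} I_i
\;\;\Longrightarrow\;\;
\delta V_{T,\mathrm{eqv}} = -\frac{\eta k T}{q}\,
\ln\!\left(\frac{1}{N}\sum_{i=1}^{N} \exp\!\left(-\frac{q\,\delta V_{T,i}}{\eta k T}\right)\right).
\end{aligned}
```

Under these assumptions each $I_i$ is lognormal, the right-hand side is a sum of lognormals, and $\delta V_{T,\mathrm{eqv}}$ is a logarithm of that sum, which is why it is not Gaussian in general.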
3. Sum of Lognormals
In this section we will focus on methods to approximate models for (2), and compare their accuracy to the CDF matching approach. First we present some background information.
Lognormal Distribution
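Let $Y = e^{Z}$. If $Z$ is normally distributed, $Z \sim N(\mu_z, \sigma_z^2)$, then $Y$ is lognormally distributed and its probability density function is [10]
$$f_Y(y) = \frac{1}{y\,\sigma_z\sqrt{2\pi}} \exp\!\left(-\frac{(\ln y - \mu_z)^2}{2\sigma_z^2}\right), \quad y > 0.$$
Often the Gaussian variable $X = 10\log_{10} Y$ is defined. $X$ is in decibels (dB), and $X = (10/\ln 10)\,Z$; if $\sigma_z = 1$, then $\sigma_x \approx 4.34$ dB.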
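Because the text switches between natural-log units and decibels, a small conversion helper can be useful. The following is a minimal sketch; the function names are hypothetical and chosen only for illustration.

```python
import math

# Scale factor between natural-log units and decibels: 10 / ln(10) ~ 4.343.
DB_PER_NATURAL_UNIT = 10.0 / math.log(10.0)


def sigma_to_db(sigma: float) -> float:
    """Convert a standard deviation of ln(Y) into decibels."""
    return DB_PER_NATURAL_UNIT * sigma


def db_to_sigma(sigma_db: float) -> float:
    """Convert a standard deviation given in decibels back to ln(Y) units."""
    return sigma_db / DB_PER_NATURAL_UNIT
```

With this helper, a standard deviation of 1 in natural-log units maps to roughly 4.34 dB, which is the threshold quoted for the FW method.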
Sum of Lognormals
A common way to compute the probability density function (pdf) of a sum of independent random variables is from the product of the characteristic functions (CFs) of the summands [11]; the characteristic function of a random variable $X$ is the expected value $E[e^{j\omega X}]$. However, for the case of the sum of lognormals, $\sum_{i=1}^{N} e^{Z_i}$, a closed form of the CF does not exist, and approximations are used instead to represent the density function of the sum. Some approximations model the sum of lognormals as an "approximately" lognormal variable (see (4)), that is, $\sum_{i=1}^{N} e^{Z_i} \approx e^{Z_m}$, where $Z_m$ is normally distributed and the subscript $m$ refers to the method of approximation.
Common techniques are the SY and FW methods mentioned earlier. In the following sections, we revisit those methods and compare them to CDF matching. Figure 4 summarizes the four techniques that will be visited: (1) the Fenton-Wilkinson method, (2) the Schwartz-Yeh method, (3) the Log Moment Matching approximation, and (4) CDF matching. All these methods except CDF matching assume that the sum is a lognormal variable; that is, that its logarithm (and hence the equivalent threshold voltage variation) is a Gaussian variable whose mean and standard deviation are to be found.
3.1. Fenton-Wilkinson
The Fenton-Wilkinson method estimates the mean and standard deviation of the Gaussian variable $Z_{FW}$ by matching the first two moments of $e^{Z_{FW}}$, namely $E[e^{Z_{FW}}]$ and $E[e^{2Z_{FW}}]$, to those of the sum of lognormals (the two sides of (4)) according to (5). It is a well-known fact [10] that, given a Gaussian variable $x$ with mean $\mu_x$ and standard deviation $\sigma_x$, (6) holds, that is, $E[e^{kx}] = \exp(k\mu_x + k^2\sigma_x^2/2)$. By relying on (5) and (6), we can obtain a closed form for the mean and standard deviation of $Z_{FW}$ according to (7) for the case of independent variables.
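The moment matching described above can be sketched in a few lines for the special case of independent, identically distributed summands. The function name and argument names are hypothetical; the closed form used here is the standard two-moment match for a sum of `n` iid lognormals and is not copied from the missing equation (7).

```python
import math
from typing import Tuple


def fenton_wilkinson_iid(mu: float, sigma: float, n: int) -> Tuple[float, float]:
    """Return (mu_z, sigma_z) of a Gaussian Z such that exp(Z) matches the mean
    and variance of sum_{i=1..n} exp(X_i), with X_i iid N(mu, sigma^2)."""
    # Variance match: exp(sigma_z^2) - 1 = (exp(sigma^2) - 1) / n
    var_z = math.log1p(math.expm1(sigma ** 2) / n)
    # Mean match: exp(mu_z + var_z / 2) = n * exp(mu + sigma^2 / 2)
    mu_z = math.log(n) + mu + 0.5 * sigma ** 2 - 0.5 * var_z
    return mu_z, math.sqrt(var_z)
```

For a single summand the function returns the input parameters unchanged, which is a quick sanity check on the matching.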
3.2. Schwartz and Yeh
Again, the methodology starts with the assumption that the sum of lognormals is approximately lognormally distributed, that is, it equals $e^{Z}$ where $Z$ is a Gaussian variable. The SY method approximates the mean and standard deviation of $Z$, the logarithm of the sum, through numerical recursion, whereas FW matches the first two moments of the sum itself. The Schwartz and Yeh method relies on the ability to compute exactly the mean and standard deviation of $Z$ when the number of lognormal variables in the sum is two. For more than two summands, the method relies on a recursive approach, adding one summand at a time. Hence, the mean and standard deviation of $Z$ can be derived from the set of generalized equations (10), in which the logarithm of the partial sum obtained so far is assumed to be normally distributed and the summands are assumed to be uncorrelated; the auxiliary terms in (10) are defined in (11). Finally, the set of equations (12)-(13) illustrates how these terms can be computed according to [12]; this slightly modified implementation was intended to circumvent the round-off error in the numerical integration of the original Schwartz and Yeh implementation [13]. Thus, at each step of the recursion, we compute the mean and standard deviation of the logarithm of the partial sum. The integrals are numerically computed using the functions defined in (13), and their values are used to solve for the terms needed to evaluate (10). The final estimate for the mean and standard deviation is reached once all summands have been added.
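The recursion can be illustrated with a short sketch. This is not the implementation of [12] or [13]: the expectations are evaluated here with Gauss-Hermite quadrature, which is a substitution made for brevity, and the function name, the quadrature order, and the restriction to iid summands are assumptions. The partial sum is updated as $Z_{new} = Z + \ln(1 + e^{w})$ with $w = X_{new} - Z$ treated as Gaussian.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss


def schwartz_yeh_iid(mu, sigma, n, order=64):
    """Approximate (mean, std) of Z = ln(sum_{i=1..n} exp(X_i)), X_i iid N(mu, sigma^2),
    by adding one summand at a time and assuming the running Z stays Gaussian."""
    # Probabilists' Gauss-Hermite rule; normalized so sum(weights * g(nodes)) ~ E[g(N(0,1))].
    nodes, weights = hermegauss(order)
    weights = weights / np.sqrt(2.0 * np.pi)

    mu_z, var_z = mu, sigma ** 2  # the running sum starts with a single term
    for _ in range(n - 1):
        # w = X_new - Z_running, with X_new independent of Z_running
        mu_w = mu - mu_z
        var_w = sigma ** 2 + var_z
        w = mu_w + np.sqrt(var_w) * nodes
        softplus = np.logaddexp(0.0, w)               # ln(1 + e^w), computed stably
        g1 = np.sum(weights * softplus)               # E[ln(1 + e^w)]
        g2 = np.sum(weights * softplus ** 2)          # E[ln(1 + e^w)^2]
        g3 = np.sum(weights * (w - mu_w) * softplus)  # E[(w - mu_w) ln(1 + e^w)]
        # Cov(Z_running, w) = -var_z, so the cross term is -2 * var_z / var_w * g3.
        mu_z, var_z = mu_z + g1, var_z + (g2 - g1 ** 2) - 2.0 * var_z / var_w * g3
    return mu_z, np.sqrt(var_z)
```

For two summands the Gaussian assumption on the running term is exact, so the result should agree with sampling up to quadrature and Monte Carlo error; for more summands the assumption is only approximate, which is the source of the deviations discussed later in the text.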
3.3. Log Moment Matching Approximation
While the previous two approaches perform moment matching through analytical or recursive analytical solutions, this approach relies on sampling to compute the moments according to (10) [9]. Similar to the SY method, it estimates the first two moments of the logarithm of the sum. It does not yield a closed-form solution, but it maintains the lognormal assumption. The recursive analytical approach in SY and this approximation do not always agree, as we will see in Section 3.5, especially since the SY assumption holds exactly only when there are two summands.
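A sampling-based estimate of the log moments can be sketched as follows. The function name, the default sample count, and the fixed seed are assumptions made for illustration; the summands are taken as iid Gaussian exponents.

```python
import numpy as np


def log_moment_matching_iid(mu, sigma, n, num_samples=200_000, seed=0):
    """Estimate (mean, std) of ln(sum_{i=1..n} exp(X_i)) from Monte Carlo samples,
    with X_i iid N(mu, sigma^2). The result defines the fitted Gaussian for ln(sum)."""
    rng = np.random.default_rng(seed)
    x = rng.normal(mu, sigma, size=(num_samples, n))
    # logaddexp.reduce evaluates ln(sum(exp(x))) row by row without overflow.
    z = np.logaddexp.reduce(x, axis=1)
    return z.mean(), z.std(ddof=1)
```

The memory footprint grows with `num_samples * n`, so for many summands it may be preferable to process the samples in batches and accumulate the moments.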
3.4. CDF Matching
This is the closest possible to the true sum. Unlike the previous approximations, it does not rely on the lognormal approximation. We build a piece-wise linear (pwl) probability density function from the Monte Carlo samples. Our goal is to demonstrate the difference between this approach and the lognormal approximations when it comes to modeling the tail regions and the resultant probability of fail estimates. Key features of the problem are:
(i) The piece-wise linear fit for the distribution of the equivalent threshold voltage variation is non-Gaussian.
(a) The pwl function is defined by a set of breakpoint pairs.
(b) The pwl function can be sparse in the center of the distribution and denser in the tails to adequately model the low fail probability region. In the extreme, the tail probabilities can be recomputed from tail samples to avoid interpolation errors.
(ii) Generating the threshold voltage samples is cheap, and so is evaluating the leakage for each sample once the current function, or even tables of it, are available. Interpolation, bootstrapping, and other techniques can reduce the number of real simulations needed and still enable good confidence in the density function. After all, the previous approaches also rely on the availability of a closed-form function for the leakage current. The number of samples is inversely proportional to the tail probability of interest; for example, if we are looking for accurate probabilities in the range of $p$, then we need replications whose sample count is larger than $1/p$. Replications add to the confidence in the expected tail probability. After all, the interest is mainly in the CDF tails.
(iii) This model can accommodate any complex nonlinear current function, even if it is different from the exponential approximation above.
(iv) Most importantly, once the distribution of the summed leakage is available, the distribution of the equivalent threshold voltage variation is derived by reverse fitting the samples.
Note that for purposes of comparison, in this study all methods share the same exponential current model.
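The CDF matching flow can be sketched as follows: sample the individual threshold voltage variations, reverse fit each summed leakage to an equivalent threshold voltage variation, tabulate a piece-wise linear CDF that is denser in the tails, and draw new equivalent-device samples from the table. Everything named here is hypothetical: the exponential factor `a` stands in for the leakage coefficient of (1), the current multiplier is assumed equal to the number of cells, and the breakpoint counts are arbitrary.

```python
import numpy as np


def equivalent_dvt_samples(sigma0, n_cells, a=1.0, num_samples=200_000, seed=0):
    """Reverse fit: dvt_eqv such that n_cells * exp(-a * dvt_eqv) = sum_i exp(-a * dvt_i)."""
    rng = np.random.default_rng(seed)
    dvt = rng.normal(0.0, sigma0, size=(num_samples, n_cells))
    log_avg = np.logaddexp.reduce(-a * dvt, axis=1) - np.log(n_cells)
    return -log_avg / a


def build_cdf_table(samples, num_body=50, num_tail=50, tail=1e-3):
    """Piece-wise linear CDF breakpoints (values, probs), denser in both tails."""
    n = len(samples)
    low = np.geomspace(1.0 / n, tail, num_tail, endpoint=False)  # log-spaced tail probs
    body = np.linspace(tail, 1.0 - tail, num_body)
    probs = np.concatenate([low, body, 1.0 - low[::-1]])
    values = np.quantile(samples, probs)
    return values, probs


def sample_from_table(values, probs, size, rng):
    """Inverse-transform sampling by linear interpolation of the tabulated CDF."""
    u = rng.uniform(probs[0], probs[-1], size=size)
    return np.interp(u, probs, values)
```

A typical use would be `values, probs = build_cdf_table(equivalent_dvt_samples(0.05, 16))`, followed by `sample_from_table(values, probs, 1000, np.random.default_rng(1))` to drive a single equivalent device in a yield study. The table cannot resolve probabilities below one over the sample count, consistent with the sample-size remark in item (ii).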
3.5. Theoretical Experiments
In this section, we compare the different methods' ability to model the sum of lognormals (Figure 4). We study both the moment matching abilities and the upper and lower CDF tails. The study is performed over different combinations of $N$, the number of summands, and $\sigma$ (the standard deviation of each Gaussian exponent); the $\sigma$ range corresponds to 2 dB to 16 dB. Recall that 4 dB is often considered a critical threshold for how accurately the FW method can model the distribution. Also, as demonstrated in [6], the standard deviation of the threshold voltage of scaled devices can exceed 4 to 8 dB (this is based on the leakage coefficient in (1)).
For each combination of summand count $N$ and standard deviation $\sigma$, multiple replications of 1 million samples are studied; this enables good estimates for the low fail probabilities. Figure 5 plots the mean and standard deviation of the logarithm of the sum obtained from the different approximations. The FW method falls behind for larger $\sigma$ (>1, or 4 dB). Recall that this method was intended to match the moments of the sum and not those of its logarithm. We note that the FW estimates underestimate the mean and overestimate the variance of the logarithm of the sum for larger dB values. Finally, the SY and LMM methods match the CDF matching results well, yet they differ slightly from each other for larger sigma values, given that the former is a recursive approximation.
Figure 6 illustrates the histogram (pdf plot) of the logarithm of the sum; the FW method does not model the body of the distribution well compared to the other methods; this was also indicated in [6] for a small number of summands. The trend is even more obvious as the number of summands increases. However, to obtain a complete picture, there is a need to study the moment matching of the lognormal sum itself and the tail regions, which we cover next.
Figure 7 plots the mean and standard deviation of the sum itself obtained from the different approximations. The SY and LMM methods, which match the moments of the logarithm of the sum, fall behind for larger $\sigma$ (>1, or 4 dB). This is true for a large number of summands (>2), and we note that SY tends in particular to underestimate the variance of the sum of lognormals; this effect is most visible when the variables are identical and uncorrelated, as is the case in this study. To study the tail regions, we rely on a "tail log plot" as illustrated in Figure 8. Without loss of generality, we set the x-axis in these plots to be in dB; a small shift can mean a large change in values. The plot is derived from the cumulative distribution function $F$ and its complement $1 - F$. It is such tail probabilities that are linked to the fail probabilities of a design. Note that for the case of leakages, the right-side tail is critical for the fails (larger Y values correlate with larger leakage values in real applications).
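One way to produce the curves for such a tail log plot from samples is sketched below. The function name and the choice of plotting positions `i / (n + 1)` are assumptions; the paper's exact construction of Figure 8 is not available in this text.

```python
import numpy as np


def tail_log_curves(y):
    """Return (x_db, log10 F, log10 (1 - F)) from positive samples y of the sum.
    The left tail is read from log10 F and the right tail from log10 (1 - F)."""
    x_db = 10.0 * np.log10(np.sort(np.asarray(y, dtype=float)))  # x-axis in dB
    n = x_db.size
    # Plotting positions strictly inside (0, 1) so both logarithms stay finite.
    cdf = np.arange(1, n + 1) / (n + 1.0)
    return x_db, np.log10(cdf), np.log10(1.0 - cdf)
```

Plotting both logarithmic curves against `x_db` for each approximation and for the sampled sum shows directly how many decades of tail probability separate the models at a given leakage level.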
Figure 9 illustrates the tail probability plots as a function of $\sigma$ for a fixed number of summands. We get a good match for all the methods for small $\sigma$. For critical values (4 to 8 dB), we note that SY and FW miss the right tail. As $\sigma$ increases, FW improves its modeling of the right tail at the expense of the left tail. Figure 10 illustrates the tail probability plots as a function of the number of summands for a fixed $\sigma$ (8 dB). The SY and LMM methods have larger errors in modeling the right tail. The FW error increases with an increasing number of summands.
4. Case Study: Leak-Down Time Comparisons
In this section, we extend the analysis to study the impact of the different modeling schemes, compared to the CDF matching method, on the probability of fail estimations. Figure 11 illustrates a summary of the distribution approximations based on the relation between the equivalent threshold voltage variation and the summed leakage in (2); except for the CDF matching method, the equivalent threshold voltage variation is modeled as a Gaussian random variable. The distributions are then used to analyze the leak-down of the RBL in the circuit of Figure 12. The time for the bitline to discharge to 50% of the rail under an extreme noisy corner voltage is then estimated.
Figures 13, 14, and 15 illustrate the normalized time-to-leak distributions for the case of 16 cells/bitline. $\sigma_0$ represents a lower limit on the threshold voltage standard deviation of a 45 nm cell device; its value is set to the equivalent of 2.5 dB. This is conservative, especially since we expect more variability as technology scales and since additional sources of variability, such as random telegraph noise, can add to the variation; cases for standard deviations of $1.3\sigma_0$ and $1.6\sigma_0$ are also plotted. The SY, FW, and LMM methods overestimate the time-to-leak by 10% for $\sigma_0$ to close to 100% for $1.6\sigma_0$ (see the horizontal arrow in the figures; this corresponds to a system yield around 4.5 sigma). More importantly, this leads to underestimating the probability of the number of elements failing at a given leak-time. Note that time-to-leak values can be critical relative to operating frequencies, and accurate prediction is needed for robust designs. Thus, we are interested in computing the ratio of the probability of fails (vertical arrows in the figures) for the predicted time-to-leak values. Figure 16 summarizes the ratio of the true (CDF) probability of fail to that of the other methods (SY, FW, and LMM) at their 4.5 sigma yield leak-time. Each experiment is based on the average of 25 × 1 million replications. This is done for the case of 16 and 64 cells/bitline and at increments of the standard deviation. We note that the SY, FW, and LMM methods underestimate the probability of fail by 10× to 147×.
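The fail probability ratio can be illustrated with a simplified stand-in for the circuit analysis. The sketch below is not the paper's leak-down simulation: it assumes a fail occurs whenever the summed leakage exceeds a threshold (as it would for a constant-current discharge, where time-to-leak is inversely proportional to leakage), it uses the FW fit as the example lognormal model, and it defaults to a 3-sigma threshold so that a modest sample count suffices. All names and defaults are hypothetical.

```python
import math
import numpy as np


def fw_params_iid(sigma, n):
    """FW fit (mu_z, sigma_z) for the sum of n iid lognormals exp(N(0, sigma^2))."""
    var_z = math.log1p(math.expm1(sigma ** 2) / n)
    mu_z = math.log(n) + 0.5 * sigma ** 2 - 0.5 * var_z
    return mu_z, math.sqrt(var_z)


def fail_probability_ratio(sigma, n, k_sigma=3.0, num_samples=400_000, seed=0):
    """Ratio of the sampled right-tail probability to the probability predicted by
    the FW lognormal model, evaluated at the model's own k-sigma leakage threshold."""
    rng = np.random.default_rng(seed)
    x = rng.normal(0.0, sigma, size=(num_samples, n))
    z = np.logaddexp.reduce(x, axis=1)                 # ln of the summed leakage
    mu_z, sigma_z = fw_params_iid(sigma, n)
    z_threshold = mu_z + k_sigma * sigma_z             # model's k-sigma point
    p_model = 0.5 * math.erfc(k_sigma / math.sqrt(2))  # Gaussian upper-tail probability
    p_sampled = float(np.mean(z > z_threshold))
    return p_sampled / p_model
```

A ratio above 1 means the lognormal model underestimates the fail probability at its own threshold. Reaching the 4.5-sigma level discussed in the text would require far more samples (or replications) than the defaults used here.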
5. Conclusions
We study the ability of different sum of lognormals approximations to emulate the leakage of multiple leaky devices by a single equivalent device. With the goal of rare event estimation, tail distributions are examined closely. Modeling the tail probability by CDF matching is found to be critical compared to the Fenton-Wilkinson and Schwartz-Yeh methods, which are found to underestimate the tail of the sum of lognormals and hence underestimate the fail probability by 10× to 147×; this trend is expected to increase as the variability increases with technology scaling.