Abstract

This paper explores passive switched capacitor based RF receiver front ends for spectrum sensing. Wideband spectrum sensors remain the most challenging block in the software defined radio hardware design. The use of passive switched capacitors provides a very low power signal conditioning front end that enables parallel digitization and software control and cognitive capabilities in the digital domain. In this paper, existing architectures are reviewed followed by a discussion of high speed passive switched capacitor designs. A passive analog FFT front end design is presented as an example analog conditioning circuit. Design methodology, modeling, and optimization techniques are outlined. Measurements are presented demonstrating a 5 GHz broadband front end that consumes only 4 mW power.

1. Introduction

With the growth of the wireless industry, the spectral congestion caused by wireless user traffic has become a significant concern that threatens further growth of the technology [1, 2]. However, this congestion is a result of suboptimal frequency usage arising from the inflexibility of the spectrum licensing process. This inefficiency in spectrum allocation can be solved by allowing spectrum sharing using the concept of a cognitive radio (CR), an intelligent device that is able to dynamically adapt and negotiate wireless frequencies and communication protocols for efficient communications. For this, each participating device needs to have many capabilities such as determining location, analysing the external communications environment, sensing the spectrum used by its neighboring devices, dynamically changing the frequency and bandwidth of transmission, adjusting the output power level, and even altering transmission parameters and protocols [3].

Figure 1 provides an indication of the growth of cognitive radios as a research area in the recent past. The figure shows results for the number of publications with different keywords per year in the IEEE. Many of the keywords represent growing research areas in wireless, whereas other popular keywords such as “VLSI” and “DSP” have also been included for comparison. The first cognitive radio paper was published in 1999; however, research in this area was relatively dormant till 2004. Since then, with maturing technology and rising needs, cognitive radios have seen a tremendous growth in research activity and are now one of the most researched areas in wireless.

A cognitive radio can be structurally and functionally separated into (1) a software defined radio (SDR) unit that includes the hardware of the cognitive radio and (2) an intelligence unit, that provides the required software based intelligence (cognition) to the radio. In this paper, the SDR unit and, more specifically, the spectrum sensing receiver front end of the SDR will be discussed.

The driving force behind the cognitive radio concept has been the use of dynamic spectrum access [4, 5]. Dynamic spectrum access relies on dynamic spectrum monitoring using a spectrum sensor. Combined with spatial and temporal information, it can be used to perform dynamic spatial [6] or spatio-spectral beamforming [7] to exploit temporal, spatial, and spectral degrees of freedom. In this paper, we focus on the spectrum sensing aspect of the cognitive radio.

Among other features, this continuous monitoring of the spectral environment makes the cognitive radio unique in its hardware. From the hardware perspective, the spectrum sensor remains a challenging aspect of cognitive radio design. Even for narrowband (small frequency range, <100 MHz) spectrum sensors, limiting the power consumption is a challenge. The cognitive radio spectrum sensor needs to detect signals at all frequencies of interest instantaneously. In addition, very high detection sensitivity is desired (perhaps 100 times better than a conventional narrowband radio) to overcome the hidden-terminal problem, shadowing, channel fading, multipath, and so forth, lest it cause interference to other users due to incorrect sensing [4].

In this paper, we demonstrate the suitability of passive switched capacitor signal processing techniques for spectrum sensing applications. We present various techniques in passive switched capacitors that allow them to be used in high speed, low power RF applications. As an example, we present a prototype passive charge based FFT design, first presented in [8], that can instantaneously analyze wideband signals (5 GHz bandwidth) with very low power consumption using these techniques. We present previously unpublished details on the FFT design methodology, architecture choice, and optimization techniques. We derive a linear time invariant (LTI) model of the system for use in system level designs. New measurement results are presented to corroborate the suitability of this design for spectrum sensing applications.

2. Review of SDR Spectrum Sensors

The architecture design for the SDR analog/RF is significantly different from that of traditional narrowband radios. In the original software radio proposal by Joseph Mitola in 1992, he envisioned an architecture that digitized the RF bandwidth (no downconversion) and performed spectrum analysis and demodulation in the digital domain. While providing the maximum amount of flexibility through increased software capability in the digital domain, this architecture imposes impractical requirements on the analog-to-digital and digital-to-analog converters. For example, as discussed in [9], a 12 GHz, 12-bit ADC that might be used for this purpose would dissipate 500 W of power! As a result, the ideal goal of being able to communicate at any desirable frequency, bandwidth, modulation, and data rate by simply digitizing the input and invoking the appropriate software remains far from realizable.

Subsequent proposals for spectrum sensing architectures can be divided into two fundamental categories: a scanner type and a wide bandwidth instantaneous digitizer type.

2.1. Scanner Architecture

In this scheme, a narrowband, wide-tuning receiver scans and digitizes the entire bandwidth (similar to a bench-top spectrum analyzer) for analysis. The digital back end processes each band sequentially and stitches the frequency domain outputs to obtain a spectral map of the environment. An example of this architecture is shown in Figure 2. Note, however, that, in order to overcome issues such as multipath, fading, hidden nodes, and interference problems [4], the sensitivity and dynamic range requirements of the architecture are more challenging than those of a traditional communications receiver. Moreover, sensing may be a blind detection problem, as opposed to traditional reception where a priori knowledge of the transmitted signal is available.

Although the scanning architecture is able to reuse some features of a traditional receiver architecture, this detection technique suffers from multiple shortcomings. These systems lack the agility to be able to detect any fast-hopping signals. Frequency domain stitching is power hungry in the digital domain due to the need to correct phase distortion introduced by the analog filters. Moreover, stitching the frequency domain information from several scans is imperfect in the face of multipath; consequently, signals spanning across multiple scan bandwidths are imperfectly reconstructed. Due to these and other reasons, it is desirable to construct a real-time instantaneous bandwidth digitizer (similar to J. Mitola’s original software radio idea) in the spectrum sensor.

2.2. Wideband Digitizer Architecture

Unlike the scanning type architecture, a wideband instantaneous digitizer is expected to digitize the entire RF bandwidth simultaneously. Understandably, the wideband digitizer has widely been considered as the bottleneck to the realization of the SDR based cognitive radio. A number of efforts in recent years have focused on wider bandwidths, broadband matching, higher front end linearity, and, most importantly, wideband analog to digital converters.

Several architectures have been proposed for the RF front end. Of these, the most popular is the extension of the traditional receiver architecture as shown in Figure 3 effectively performing an RF to digital (R-to-D) conversion [10]. Typically, the front end also requires a wideband low noise amplifier (LNA) prior to the digitizer (not shown). Moreover the front end needs to handle a very large dynamic range due to the generally large peak-to-average power ratio (PAPR) of wideband signals. The increase in PAPR for wide bandwidths is described in Figure 4. As shown, the PAPR for the narrowband signals is only 2, while that for the wideband signal (5 times the bandwidth) with multiple signals, all having similar powers, is 10. As a result of the large PAPR of the wideband inputs, a very linear front end is required. The linearity requirements of the LNA have been addressed in [11]. Another approach using a low noise transconductance amplifier (LNTA) followed by mixers is discussed in [12]. Moreover, passive mixer-first topologies have been proposed for high performance [13].
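The PAPR values quoted above are consistent with a simple model in which the wideband input is a sum of equal-amplitude sinusoids that momentarily align in phase: a single tone has a PAPR of 2, and $N$ such tones have a PAPR of $2N$. The following Python sketch checks this numerically. The function name, the integer tone frequencies, and the worst-case phase alignment are assumptions made for illustration and are not taken from Figure 4.

```python
import math


def papr_of_equal_tones(num_tones, samples=4096):
    """Peak-to-average power ratio of a sum of equal-amplitude cosines.

    Assumption: tone k completes (k + 1) whole cycles in the window, so the
    tones are mutually orthogonal and all of them peak together at t = 0.
    """
    waveform = [
        sum(math.cos(2 * math.pi * (k + 1) * n / samples) for k in range(num_tones))
        for n in range(samples)
    ]
    powers = [v * v for v in waveform]
    average_power = sum(powers) / samples
    return max(powers) / average_power


if __name__ == "__main__":
    for tones in (1, 5):
        print(tones, "tone(s): PAPR =", round(papr_of_equal_tones(tones), 3))
```

Under these assumptions the ratio grows linearly with the number of occupied channels, which is why splitting the band back into channels relaxes the dynamic range seen by each converter.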

The digitizer block shown in the figure is essentially an ADC with performance specifications beyond the capability of using state-of-the-art converters. This wideband digitizer can be implemented in multiple ways, all based on some form of multiplexing in order to ease the requirements on the ADCs. A multiplexed broadband approach using time interleaving can be utilized as shown in Figure 5 [10]. This scheme reduces the sampling rate of ADCs. However, all the ADCs still see the full bandwidth and, therefore, still require high dynamic range capability.

In order to reduce the dynamic range requirements on the ADCs, it is possible to transform the signal to a different domain prior to digitization. Specifically, a frequency domain transform is particularly attractive [10]. A frequency domain transform can be approximated in practice using band-pass filters for channelization. This reduces the dynamic range requirements on the ADCs but introduces the problem of designing impractically sharp band-pass filters. Replacing sharp band-pass filters by frequency downconverters followed by sharp low-pass filters eliminates this problem as shown in Figure 6 [10]. However, these downconverters are based on PLLs, mixers, and low-pass filters [14] or on injection locked oscillators [15], and can be power hungry. Note that injection locked oscillators have the advantage of a larger noise suppression bandwidth (approximately the lock range) [16] and provide better reciprocal mixing robustness compared to PLLs, assuming that the reference phase noise is better than the VCO phase noise. Moreover, harmonic mixing of signals within the SDR input bandwidth severely corrupts the channelized baseband signals. Additionally, due to overlap between bands and phase issues, signal reconstruction from the digitized filter bank outputs is challenging.

In this paper, we propose a digitizer approach based on analog signal processing using passive switched capacitors to condition the signal prior to digitization by ADCs (Figure 7). The RF discrete time (DT) signal processing, as shown in the second block in Figure 7, eases the dynamic range requirements on the ADCs by prefiltering the signal.

For RF sampled processors, an RF sampler has historically been an inherent bottleneck. However, with the scaling of technology and subsequent improvement in switch performance, RF sampling has become feasible in modern silicon processes. Moreover, it is possible to use charge domain sampling to leverage its inherent benefits: a built-in antialias filter in the sampler, robustness to jitter, and the ability to vary the resulting filter notches by simply varying the integration period. This use of RF samplers and subsequent discrete time processing provides a number of advantages in deep submicron CMOS processes [17]. Recently, other discrete time radio receivers using RF sampling have been demonstrated using CMOS technology for Bluetooth [18], GSM/GPRS [19], WLAN [20], and SDR type applications [9, 21].

3. Passive Analog Signal Processing

In this section, we show how signal sampling and variable-rate analog signal processing can be performed in the charge domain for spectrum sensing applications. Many of the benefits of the discrete time FFT architecture are based on the use of passive discrete time charge based computations. This is best illustrated with the help of an example design. The passive switched capacitor shown in Figure 8 is able to operate at RF sampling speeds [22].

In this circuit the input signal is sampled progressively in time. After $N$ clock periods the averaged output is sampled onto the output capacitor, which has previously been discharged. The complete circuit implements an $N$-tap FIR filter that is decimated by $N$. Interestingly, if the output capacitor is not discharged between each rotation then the circuit implements an $N$-tap FIR filter combined with a first-order IIR filter that is decimated by $N$. Note there is no active element (i.e., amplifier) in this circuit. The circuit consists only of switches and capacitors, so the maximum sampling rate is only dependent on the settling times. Additionally, the only power dissipation, other than that required for sampling the signal from the input, is due to the charging and discharging of the switch-transistor gate capacitors in a very digital-like way. As a result, a variety of functions on the sampled signal can be computed very fast and using minimal power.
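The filtering behavior described above follows from charge conservation and can be sketched in a few lines of Python. This is an idealized model that assumes complete settling and lumps the averaging and read-out into a single share step; the function name, the parameter names, and the unit capacitor values are hypothetical and are not taken from Figure 8.

```python
def passive_fir_decimator(samples, num_taps, c_sample=1.0, c_out=1.0, discharge=True):
    """Ideal charge-domain model of a moving-average filter decimated by num_taps.

    With discharge=True the output capacitor is reset before every read-out
    (FIR only). With discharge=False it keeps its previous charge, which adds
    a first-order IIR term to the response.
    """
    outputs = []
    v_out = 0.0
    for start in range(0, len(samples) - num_taps + 1, num_taps):
        window = samples[start:start + num_taps]
        if discharge:
            v_out = 0.0
        # Charge conservation when the sampling capacitors share with the output capacitor.
        total_charge = c_sample * sum(window) + c_out * v_out
        v_out = total_charge / (num_taps * c_sample + c_out)
        outputs.append(v_out)
    return outputs


if __name__ == "__main__":
    step = [1.0] * 8
    print(passive_fir_decimator(step, 4, discharge=True))
    print(passive_fir_decimator(step, 4, discharge=False))
```

In this model the FIR-only output sits at the attenuated average of each window, while the undischarged case creeps toward the input level over successive rotations, which is the low-pass IIR behavior noted in the text.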

3.1. Passive Computations

For performing any linear function, addition and multiplication operations need to be performed. Note that all passive switched capacitor operations are destructive in nature. Therefore, once an operation is performed, the input values are lost. For performing multiple operations on a single input, multiple copies of the input need to be maintained. Here we present techniques to perform these operations using passive switched capacitor circuits. In order to select a suitable technique for implementation, it is necessary to compare these techniques based on their robustness to nonidealities, ease of implementation, power consumption, speed, and so forth.

3.1.1. Addition Operation

(1) Parallel Connection. Using passive switched capacitors, voltages may be added by sharing the charges on two participating capacitors by connecting them in parallel as shown in Figure 9. The result of this operation (for capacitors with equal capacitances) is the average value of the input voltages $V_1$ and $V_2$, that is, $(V_1 + V_2)/2$, which is a scaled version of their sum. Also note that two copies of the output are obtained and these can be used for two subsequent independent operations as desired. However, the operation inherently attenuates the output by half. From an implementation perspective, use of parallel capacitors allows the sharing of one plate (ground plate) for all the capacitors. This can greatly reduce the parasitic capacitance and resistance of the capacitor and the area of the overall implementation.
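The parallel share operation can be illustrated with a short Python sketch based on charge conservation. The function name and the example values are assumptions for illustration; ideal capacitors, ideal switches, and complete settling are assumed.

```python
def share(voltages, capacitances):
    """Common voltage after ideal charge sharing among parallel capacitors."""
    if len(voltages) != len(capacitances):
        raise ValueError("one capacitance is needed per voltage")
    total_charge = sum(c * v for c, v in zip(capacitances, voltages))
    return total_charge / sum(capacitances)


if __name__ == "__main__":
    # Two equal capacitors: the result is the average, i.e. half of the sum.
    print(share([0.6, 0.2], [200e-15, 200e-15]))
```

For equal capacitors the result is the average of the inputs, and every participating capacitor ends up holding a copy of it.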

(2) Series Connection. An alternative technique is to connect the capacitors in series. The result of this operation is the sum ($V_1 + V_2$) of the input voltages $V_1$ and $V_2$. In this scheme, it is possible to use slightly delayed clock phases for the top and bottom plate switches in order to make the charge injection independent of the input voltage [23]. However, in the latter technique, switches are required both on the top and bottom plate, thereby increasing the power consumption in this circuit. The two switches placed in series halve the speed of this circuit for identical switch sizes. Moreover, only one output (which can be used for exactly one subsequent operation) is obtained. Also, both the top and bottom plate parasitics are problematic.

3.1.2. Multiplication

(1) Charge Stealing. Multiplication in the charge domain can be performed by scaling the voltage on a capacitor using a share operation with another known capacitor (stealing capacitor). The charge on the stealing capacitor is not utilized later on. The overall operation causes a subunity scaling on the original value. The scaling factor for a capacitor of value $C$ and a stealing capacitor of value $C_s$ is given by $C/(C + C_s)$.

Figure 9(b) shows a scaling operation using a stealing capacitor of size $C_s$ with no initial voltage on it. After the sharing operation, the final value on the capacitor with initial value $V$ becomes $V \cdot C/(C + C_s)$. $C_s$ can be chosen appropriately to obtain a particular scaling factor. Note that, although this technique is capable of performing both subunity scaling and multiplication with a known attenuation, at least one of the operands needs to be known in advance for this implementation. In case voltage dependent variable capacitors (i.e., capacitor DACs) are utilized, dynamic operands can also be used.
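As a worked illustration of how a stealing capacitor might be sized, the Python sketch below inverts the charge-sharing relation to find the stealing capacitance that yields a desired subunity scaling factor. The function names and the 200 fF example value are assumptions for illustration; parasitics and incomplete settling are ignored.

```python
import math


def stealing_capacitor_for(scale, c_signal):
    """Stealing capacitance that scales the voltage on c_signal by `scale`."""
    if not 0.0 < scale <= 1.0:
        raise ValueError("passive charge stealing only supports 0 < scale <= 1")
    return c_signal * (1.0 / scale - 1.0)


def steal(v_initial, c_signal, c_steal):
    """Voltage left on c_signal after sharing with an initially empty c_steal."""
    return v_initial * c_signal / (c_signal + c_steal)


if __name__ == "__main__":
    c_signal = 200e-15
    for target in (0.5, 1.0 / math.sqrt(2.0)):
        c_steal = stealing_capacitor_for(target, c_signal)
        print(target, c_steal, steal(1.0, c_signal, c_steal))
```

A scaling factor of one half needs a stealing capacitor equal to the signal capacitor, and factors closer to unity need proportionally smaller stealing capacitors.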

(2) Pulse-Width Modulation (PWM). Another technique to perform multiplication using passive switched capacitors is to modulate the turn on time of the switch and perform an incomplete share operation with a fixed stealing capacitor. The duration of the operation determines the multiplication factor. It is possible to multiply two unknown operands using this technique. Unfortunately, considering the nonlinearity in the resistance and the share operation, the errors caused by this technique make it unusable. However, the concept can be used to devise another PWM scheme which allows complete settling, thereby making it more reliable. In this modified technique, the switch can be turned on using a sequence of randomly placed pulses and sharing the capacitor charge using a small stealing capacitor for each clock cycle. The stealing capacitor is discharged at the end of each cycle. Complete settling is allowed in each cycle. The total number of on-pulses determines the amount of scaling. Maximum scaling is obtained when all the clock cycles have on-pulses, while no scaling is obtained when all the clock cycles have off pulses. Although this technique is relatively accurate and is able to handle dynamic operands, it is slow and consumes more power than the charge stealing technique. Also, depending on the accuracy required, the attenuation is considerable.
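The modified multi-pulse scheme can be modeled as a repeated, fully settled charge-stealing step: each on-pulse multiplies the stored voltage by the same subunity factor, so only the number of on-pulses matters and not their placement. The Python sketch below is a minimal model under that assumption; the function and parameter names are hypothetical.

```python
def pulsed_steal(v_initial, c_signal, c_steal, pulses):
    """Voltage after a sequence of clock cycles; True marks an on-pulse.

    Assumes complete settling in every cycle and a stealing capacitor that is
    discharged at the end of each cycle, as described in the text.
    """
    per_pulse_factor = c_signal / (c_signal + c_steal)
    voltage = v_initial
    for on in pulses:
        if on:
            voltage *= per_pulse_factor
        # Off cycles leave the stored voltage untouched.
    return voltage


if __name__ == "__main__":
    print(pulsed_steal(1.0, 10.0, 1.0, [True, False, True, False]))
    print(pulsed_steal(1.0, 10.0, 1.0, [False, True, False, True]))
```

The model also shows the cost noted in the text: fine resolution needs a small per-pulse step and therefore many clock cycles per multiplication.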

(3) Current Domain. If the charge is converted to the current domain, a single variable-duration PWM scheme can be used to perform multiplication. Also, multiplication would not entail an inherent attenuation. However, the technique is very power hungry, and the accuracy of the transconductance amplifier that translates from charge to current domain needs to be very high.

Due to their low power, high speed characteristics, we have focused on the parallel connection scheme for addition and the charge stealing scheme for multiplications in our designs. For many relevant linear algebra problems, multiplication using fixed coefficients is sufficient, and this technique lends itself easily to such applications.

3.2. Switching Schemes

To implement these addition and multiplication schemes, a variety of switched capacitor topologies can be used. Note that complex multiplication can be performed using a combination of scalar multiplication operations as discussed in [24]. In this subsection, we discuss the various topologies and their trade-offs. For the addition operation, two capacitors can be shared as shown in Figure 9(a) and represented by Figure 10(a). We can combine a share followed by scaling into a single operation by connecting 3 capacitors (2 with input samples and 1 empty) and sharing their charges. This can be performed in different ways using 2 or 3 switches as shown in Figures 10(b)–10(d). It can be shown that 3 appropriately sized switches in the scheme of Figure 10(d) minimize the settling error [25]. Multiplication by a factor is a special case scaling operation that can be performed using a single step operation [25]. Depending on the normalization of the scaling factor, this may be performed using 4 capacitors (Figures 10(e)–10(h)) or using 5 capacitors (Figures 10(i)–10(l)). Moreover, in the case of four input operations (radix-4 operations), these schemes (Figures 10(e)–10(l)) are useful.

While many schemes (Figures 10(a)-10(b), 10(d), 10(e), 10(h), 10(i), and 10(l)) ensure settling symmetry, others (Figures 10(c), 10(f), 10(g), 10(j), and 10(k)) use fewer switches for lower power at the expense of settling performance and mismatch. Some variants (Figures 10(d), 10(h), and 10(l) with equal size switches) provide both settling speed and symmetry at the cost of larger power. When the switches between the operand capacitors are sized differently from those connecting to the stealing capacitor, in (d) and (l), these same configurations can be optimized for an enhanced settling-per-power performance. Finally, when comparing the different schemes, with their appropriate switch sizes, different trade-offs with regard to charge injection, clock feed-through error, and so forth should be considered.

For our design, we chose to use (a) and (d) to perform radix-2 scalar operations, while complex operations are performed by cascading two sets of operations. Configurations (h) and (l) were used to perform single-phase complex multiplication in special cases. In the case of (d) and (l), optimized switch sizing was used to mitigate their extra power demands while still realizing their enhanced settling performance for a net settling-per-power gain versus (b, c) and (i–k), respectively.

3.3. Nonidealities

Several nonidealities haunt passive switched capacitor circuits. The problem of nonidealities is aggravated by the absence of a virtual ground node unlike in op-amp based active switched capacitor circuits. The effect of sampling clock jitter in passive switched capacitor circuits has been analyzed [26]. Two important nonidealities, clock feed-through and charge injection, become a nuisance in the absence of a virtual ground node. Consequently, traditional circuit techniques such as bottom plate sampling are difficult to implement. Also, poor matching between nMOS and pMOS switches and the shrinking difference between the supply voltage $V_{DD}$ and the threshold voltage $V_{th}$ in scaled technologies make the use of transmission gate switches less effective for mitigating these nonidealities. The noise in the system is dominated by the $kT/C$ noise of the RC filter formed by the switch-capacitor combination. Moreover, for a multistage switched capacitor operation, the sampled noise voltages from one stage recombine in the later stages. These combining noise samples in a particular stage are correlated, and, therefore, the final noise becomes a complicated function of the noise sampled at each stage of the switched capacitor operation. The switch resistance (along with the capacitance of the capacitor) determines the settling time constant. However, the switch resistance is inherently nonlinear and input signal dependent. Consequently, in the case of high speeds of operation, incomplete settling can cause significant signal dependent errors in computations.

Since switched capacitor circuits utilize a clock signal, the accuracy of the clock is critical to performance. Specifically, jitter in clocks reduces the accuracy of the switched capacitor computations by translating timing uncertainty to charge and voltage uncertainty. Fortunately, new techniques based on transconductance linearization can be used to achieve low phase noise clocks in SiGe bipolar [27] and even in scaled CMOS circuits [28]. For increased frequency flexibility, highly optimized switched inductor [29] and switched capacitor [30] based LC VCOs can be utilized to obtain a wide range of frequencies without sacrificing noise performance. Moreover, on-chip self-healing techniques [31] utilizing a digital back end can be used for healing the switched capacitor circuits as well as improving the clock jitter [32].
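To give a feel for how timing uncertainty translates into voltage uncertainty, the Python sketch below uses the standard textbook bound for a voltage-sampled full-scale sinusoid, $\mathrm{SNR} = -20\log_{10}(2\pi f \sigma_t)$, where $f$ is the input frequency and $\sigma_t$ is the rms jitter. The 2.5 GHz input frequency and 1 ps jitter are illustrative assumptions rather than measured values from this work, and the bound does not capture the jitter robustness of charge domain sampling.

```python
import math


def jitter_limited_snr_db(input_frequency_hz, rms_jitter_s):
    """Upper bound on SNR for a sampled sinusoid in the presence of clock jitter."""
    return -20.0 * math.log10(2.0 * math.pi * input_frequency_hz * rms_jitter_s)


def max_jitter_for_snr(input_frequency_hz, snr_db):
    """Largest rms jitter that still allows the requested jitter-limited SNR."""
    return 10.0 ** (-snr_db / 20.0) / (2.0 * math.pi * input_frequency_hz)


if __name__ == "__main__":
    print(jitter_limited_snr_db(2.5e9, 1e-12), "dB")
    print(max_jitter_for_snr(2.5e9, 60.0), "s rms")
```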

For high speed designs, it is necessary to accurately model these nonidealities in the circuit simulator. It is also useful to have the ability to individually turn off these nonidealities to trace the effect of each nonideality on the output error. For our designs, we model the nonidealities in MATLAB and include them in system level simulations using MATLAB or Simulink [25]. This allows us to effectively capture the nonidealities and optimize the designs in their presence.

4. An Analog FFT-Based Front End

In this section, as an example of a passive switched capacitor spectrum sensing front end, we introduce a frequency domain divide and conquer approach that can enable wideband digitization. The architecture comprises an analog domain Fourier transform signal processor (see previous implementations by [33, 34]) that can be followed by multiple ADCs that digitize the input in the frequency domain. In our design, we utilize an RF sampler followed by an analog domain, discrete time, passive switched capacitor FFT engine to perform channelization of the wideband RF input. The circuits are based on the addition and multiplication techniques discussed and selected in Section 3. A description of the design of this charge reuse analog Fourier transform (CRAFT) was presented in [25]. In this paper, we use the CRAFT design as an example of a passive switched capacitor spectrum sensing front end, provide more details on the design methodology and optimization, and develop high level models for system simulations. Although the discussions here pertain to the CRAFT design, the underlying principles are general and can be easily extended to other passive switched capacitor front end circuits for similar high speed applications, including linear filtering and other transforms.

For spectrum sensing, we use CRAFT as a functionally equivalent linear phase $N$-path filter (see Figure 11) to perform channelization [35]. This scheme reduces both the required speed and dynamic range of the ADCs and, by virtue of being minimal phase, allows for simple reconstruction in the digital domain using an IFFT without any loss of information.

For dynamic range calculations, signals are assumed to be distributed evenly in frequency. Breaking the input up into equal frequency channels reduces the PAPR as explained earlier in Figure 4. In general, an $N$-path channelization of spread signals reduces the dynamic range by $N$ times (even in the more general case with multiple signals arbitrarily placed in frequency, this approximation typically holds), causing an $N$ times dynamic range reduction for the ADCs.

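The DFT computation to be performed is a time-to-frequency transform defined as

$$X[k] = \sum_{n=0}^{N-1} x[n]\, W_N^{kn}, \quad k = 0, 1, \ldots, N-1, \tag{1}$$

where $W_N$ is defined as $e^{-j2\pi/N}$. The desired 16-point DFT ($N = 16$) can also be represented as a linear matrix operation on a vector of length 16 given by

$$\mathbf{X} = \mathbf{F}\,\mathbf{x}, \tag{2}$$

where the scaling factor due to attenuation inherent in the charge domain operations is absorbed within $\mathbf{F}$. Expanding (2) for the length 16 case we get

$$\begin{bmatrix} X_0 \\ X_1 \\ \vdots \\ X_{15} \end{bmatrix} = A \begin{bmatrix} 1 & 1 & \cdots & 1 \\ 1 & W_{16}^{1} & \cdots & W_{16}^{15} \\ \vdots & \vdots & \ddots & \vdots \\ 1 & W_{16}^{15} & \cdots & W_{16}^{225} \end{bmatrix} \begin{bmatrix} x_0 \\ x_1 \\ \vdots \\ x_{15} \end{bmatrix}, \tag{3}$$

where $x_n$ are the DFT inputs ($n = 0, \ldots, 15$), $X_k$ are outputs ($k = 0, \ldots, 15$), and $A$ is the scaling factor. The equation can be further simplified by noting the symmetry and periodicity of the powers of $W_{16}$. These properties are utilized to formulate the FFT algorithm as an efficient implementation to calculate the DFT outputs in $O(N \log N)$ operational complexity rather than the $O(N^2)$ complexity of multiplication by $\mathbf{F}$.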
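The relation between the direct DFT and its FFT factorization can be checked numerically. The Python sketch below compares a direct $O(N^2)$ DFT with a textbook recursive radix-2 decimation-in-time FFT for a 16-point input. It is a floating-point software illustration only, with hypothetical function names, and it does not model the attenuation of the charge domain implementation.

```python
import cmath


def dft(x):
    """Direct DFT: every output is a full length-N sum, O(N^2) overall."""
    n = len(x)
    return [
        sum(x[m] * cmath.exp(-2j * cmath.pi * k * m / n) for m in range(n))
        for k in range(n)
    ]


def fft_radix2(x):
    """Recursive radix-2 decimation-in-time FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft_radix2(x[0::2])
    odd = fft_radix2(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddled = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddled            # butterfly: sum
        out[k + n // 2] = even[k] - twiddled   # butterfly: difference
    return out


if __name__ == "__main__":
    samples = [complex(n % 4, 0.5 * n) for n in range(16)]
    worst = max(abs(a - b) for a, b in zip(dft(samples), fft_radix2(samples)))
    print("largest difference between DFT and FFT outputs:", worst)
```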

Figure 12(a) shows the flowchart representation of the radix-2 decimation-in-time FFT algorithm used in CRAFT. As seen in Figure 12(a) and by its definition as a linear operation, the FFT uses only two types of operations: addition (and subtraction) and multiplication by fixed twiddle factors. The twiddle factors are shown as powers of $W_{16} = e^{-j2\pi/16}$ in Figure 12(a), where $W_{16}^k$ are equally spaced points on the unit circle in the complex plane as shown in Figure 12(b). As a result, for every scaling factor $W_{16}^k$, $|\mathrm{Re}(W_{16}^k)| \le 1$ and $|\mathrm{Im}(W_{16}^k)| \le 1$. Since passive computations discussed above in Section 3 inherently attenuate the signal, these operations are suitable for subunity scaling.

The CRAFT design is implemented using a number of blocks shown in Figure 13. A brief description of the circuits utilized in the CRAFT design follows. The timing diagram for the various clock phases used to operate the system is shown in Figure 14.

4.1. RF Sampler

An RF nMOS switch based voltage sampler operating at 5 GS/s for both the $I$ and $Q$ paths, effectively providing 10 GS/s, was implemented. An array of 256 samplers was used for providing inputs to CRAFT as shown in Figure 13. The timing of the sampling clock phases is shown in Figure 14. The noise contribution of the sampler is given by $kT/C$. Therefore, a larger capacitor reduces the noise. However, increasing the size of the capacitor warrants a larger switch transistor to maintain the same sampling bandwidth, increasing the power consumption in the sampler. In CRAFT, the sampling capacitor was selected to be 200 fF so that the noise from the sampler was below  dBFS. The switch size was selected to allow sufficient settling such that the output-referred settling error is below  dBFS for 5 GS/s operation.
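The capacitor sizing trade-off can be made concrete with a short calculation of the sampled thermal noise, whose mean-square value is $kT/C$. The Python sketch below evaluates the rms noise voltage for the 200 fF capacitor mentioned in the text; the 300 K temperature and the function names are assumptions for illustration.

```python
import math

BOLTZMANN = 1.380649e-23  # J/K


def ktc_noise_vrms(capacitance_f, temperature_k=300.0):
    """RMS thermal noise voltage sampled onto a capacitor (square root of kT/C)."""
    return math.sqrt(BOLTZMANN * temperature_k / capacitance_f)


def capacitance_for_noise(noise_vrms, temperature_k=300.0):
    """Smallest capacitance that keeps the sampled noise at or below noise_vrms."""
    return BOLTZMANN * temperature_k / noise_vrms ** 2


if __name__ == "__main__":
    print("200 fF:", ktc_noise_vrms(200e-15), "V rms")
    print("400 fF:", ktc_noise_vrms(400e-15), "V rms")
```

Doubling the capacitor lowers the rms noise only by a factor of $\sqrt{2}$ (about 3 dB), while the switch must grow to keep the same sampling bandwidth, which is the power penalty described above.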

4.2. CRAFT Core Design

The CRAFT core follows the RF sampler and performs an FFT operation as shown in Figure 13. The CRAFT operation, represented in matrix form as shown in (2), can be further broken down into 4 share and 4 multiply operations in the 4 constituent stages, leading to a factorization of the transform matrix into a product of four stage matrices, where each of the 4 stages is denoted by $\mathbf{F}_i$ and $i$ is the stage number. The matrices for each stage are detailed in the appendix. Each stage is implemented using parallel addition and charge stealing techniques outlined in Section 3. Switching schemes are selected to reduce power and improve settling time. Details of the design methodology, circuit design principles, and circuit optimization are discussed in Section 5. Using the optimized design methodologies, only 5 clock phases are used for the entire CRAFT processing operations. These phases have unequal durations to optimize settling and are shown in Figure 14. The total processing time is chosen to be equal to the sampling time in anticipation of an interleave-by-two implementation.
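The stage-by-stage structure can be sketched in software as an in-place radix-2 decimation-in-time FFT in which every butterfly is an averaging share, $(a \pm W b)/2$, rather than a plain sum. Under this simplified model, the four stages of a 16-point transform produce the DFT scaled by $1/16$. The sketch assumes ideal complex twiddle multiplication and does not model the additional attenuation of charge stealing, wiring parasitics, or the actual CRAFT switching schemes; the function names are hypothetical.

```python
import cmath


def bit_reverse(index, bits):
    """Reverse the lowest `bits` bits of index."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result


def averaging_fft(x):
    """Staged radix-2 DIT FFT in which each butterfly halves its outputs."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    stages = n.bit_length() - 1
    data = [complex(x[bit_reverse(i, stages)]) for i in range(n)]
    for stage in range(1, stages + 1):
        span = 1 << stage
        half = span // 2
        for base in range(0, n, span):
            for k in range(half):
                twiddle = cmath.exp(-2j * cmath.pi * k / span)
                a = data[base + k]
                b = twiddle * data[base + k + half]
                data[base + k] = (a + b) / 2         # share: average of the operands
                data[base + k + half] = (a - b) / 2  # share with inverted operand
    return data


if __name__ == "__main__":
    tone = [cmath.exp(2j * cmath.pi * 3 * n / 16) for n in range(16)]
    print([round(abs(v), 3) for v in averaging_fft(tone)])
```

A complex tone centered on bin 3 appears at unit amplitude in that bin and at zero elsewhere, showing that the per-stage halving amounts to an overall $1/N$ scaling of the ideal DFT in this model.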

As shown in Figure 12(a), after each operation, half the wires return to their bus while the rest continue on the other buses. Note that the wires in the CRAFT core are permanently connected to the sampling capacitors and their parasitics directly add to the sampling capacitance. Therefore, to equalize the sampler wiring parasitics, the switches are always placed midway between two operand buses. Two example wires, one always returning to its own bus while the other always shifting onto the other operand bus, are highlighted in the layout screenshot in Figure 15. As seen, the two wire lengths (and their associated parasitics) are nominally matched.

4.3. Output Latch

On the far end of the core, CRAFT connects through switches to operational transconductance amplifier (OTA) based analog latches that store the outputs temporarily prior to being read out (Figure 13). The read-out rate is limited by the speed of the OTA as well as the external amplifiers and ADCs. The OTAs are based on a two-stage, folded cascode, differential architecture and provide 70 dB gain with a 900 MHz unity gain bandwidth (UGB). The OTA is utilized in a differential switched capacitor analog latch configuration. As shown in the timing diagram in Figure 14, the latch performs offset cancelation and OTA common mode feedback during the sampling and processing phases and latches the output with a $10\tau$ settling accuracy ($\tau$ is the time constant of the error settling) during the next 32 clock phases. The output is then held for the external measurement system to read out using an analog multiplexer. As shown in Figure 13, thirty-two latches capture the complex valued FFT output.

4.4. State Machines

The sampling array, CRAFT processing engine, and output latches require multiple clocks to operate and interface with test equipment. These clock phases are shown in Figure 14. The input clock is used to generate all internal signals. The input state machine (labeled state m/c 1 in Figure 13) is externally triggered to initiate a conversion. It generates 16 sampling clock phases followed by the processing clocks to operate the CRAFT core switches. A second state machine (labeled state m/c 2 in Figure 13) uses handshaking with the first and an external trigger to determine when CRAFT outputs are valid. It subsequently generates the clocks for the analog latch array to save the first CRAFT conversion after being triggered. The latched outputs are then observed sequentially using the integrated low-resistance analog multiplexer (16 × 4 to 1 × 4 for differential real and imaginary outputs from one FFT bin). This setup allows asynchronous operation between the conversion and latch triggering.

5. Design Methodology and Optimization

Within the CRAFT processing engine, computational speed, dynamic range, and operating power trade off against each other. The analysis of design nonidealities, discussed in Section 3.3, represents a complex design space with different trade-offs associated with each error source and the particular mitigation techniques utilized. This section outlines the design and optimization methodology used in the CRAFT design. For this implementation, the following specifications and constraints were assumed:
(1) a 5 GS/s input rate ( and ) with an interleave-by-two CRAFT engine for processing contiguous windows (this provides a total processing time of ns);
(2) a 60 dB (10 bit) dynamic range design goal.

Using these goals, the CRAFT engine is optimized for processing power. The design methodology is divided into an architecture choice based on the constraints listed, followed by an energy optimization procedure.

5.1. Design Parameters

To specify the architectural parameters, we initially assume the processing time to be shared equally among the 5 clock phases (unlike in Figure 14). This assumption is revisited during the energy optimization procedure described later. Also, a nominal for all the stages is assumed as an initial choice and is optimized later. Based on these assumptions, the following design choices are made.
(1) Input Swing. The maximum input swing, , for the sampler is chosen to achieve −60 dBFS nonlinearity while running at 5 GS/s. This determines the peak-to-peak input swing to be used as the input full scale. For use with an nMOS switch based processing core, the common mode voltage, , is set at .
(2) Capacitor Size. The sampling capacitor size is selected such that the noise floor from the sampling operation is lower than required for the target SNDR of 60 dB. This dictates a sampling capacitor size of at least 200 fF.
(3) Attenuation. Attenuation degrades FFT performance. Consequently, all techniques that mitigate attenuation are incorporated for improved performance.
(4) Dummy Switches. The effect of clock feed-through and charge injection for each stage on the overall SNDR is simulated. Dummy switches are selected for stages where the overall SNDR is otherwise not met.
(5) Additional Settling Switches. The overall computational settling error is simulated, and additional settling switches are used for stages where their effect is clearly beneficial to SNDR performance.
(6) Sampler Switch Size. The minimum switch size that provides adequate sampler settling and nonlinearity for the required SNDR is determined.
(7) Settling. The minimum per-stage computation settling accuracy required for the overall SNDR is selected. For CRAFT, the following amounts of nominal computational settling were chosen for stages 1–4: 7τ, 4τ, 5τ, and 4τ, respectively.
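The capacitor sizing choice above can be sanity-checked with the standard kT/C sampled-noise relation. The following is a minimal sketch, not the authors' sizing procedure: it assumes a single-ended sampler at 300 K, a full-scale sine input, and a hypothetical 0.25 V signal amplitude (the actual swing is not reproduced here). The function names are illustrative only.

```python
import math

K_BOLTZMANN = 1.380649e-23  # J/K


def ktc_noise_rms(cap_farads, temp_kelvin=300.0):
    """RMS thermal noise voltage sqrt(kT/C) sampled onto one capacitor."""
    return math.sqrt(K_BOLTZMANN * temp_kelvin / cap_farads)


def min_sampling_cap(snr_db, amplitude_volts, temp_kelvin=300.0):
    """Smallest C for which a full-scale sine (power A^2/2) meets snr_db.

    Assumption: kT/C is the only noise source and the sampler is single-ended.
    """
    signal_power = amplitude_volts ** 2 / 2.0
    noise_power_allowed = signal_power / 10 ** (snr_db / 10.0)
    return K_BOLTZMANN * temp_kelvin / noise_power_allowed


# Noise on the 200 fF sampling capacitor quoted in the text.
noise_200f = ktc_noise_rms(200e-15)

# Capacitance needed for 60 dB with a hypothetical 0.25 V amplitude.
cap_needed = min_sampling_cap(60.0, 0.25)

print(f"kT/C noise on 200 fF: {noise_200f * 1e6:.1f} uV rms")
print(f"minimum C for 60 dB at 0.25 V amplitude: {cap_needed * 1e15:.1f} fF")
```

Under these assumptions the required capacitance lands in the 100 to 200 fF range, which is consistent in order of magnitude with the 200 fF lower bound once margin for the other noise contributors is included.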

5.2. Energy Optimization

The exact switch sizing in each stage, as well as the supply voltage employed, trades off against the total energy consumption per processing operation. The energy optimization algorithm is outlined below.

5.2.1. Supply Voltage

In short channel devices velocity saturation affects nMOS switches. The triode resistance of a switch in deep triode ( ) varies proportionally as below and is empirically fit as shown: For the devices in CRAFT, provided an accurate empirical fit. Using this approximation, the switch resistance is . For a constant , . In order to calculate the energy per switch operation, we compute . Using these equations, we compute the energy per switch operation for a constant switch “on” resistance: This is plotted in Figure 16. As seen, this curve has a unique minimum energy that occurs at . Naturally, this minimum coincides with the typical supply voltage in this technology to optimize digital energy per speed (e.g., consider this nMOS switch as part of an inverter). This optimum is then customized per stage depending on the varying operand voltage swings as a result of attenuation. Note that for the optimization described, it is assumed that 2 different supply domains are available for optimization to cover the general case. The optimization algorithm can be easily modified for the specific case using single or multiple voltage domains, based on availability. Additionally, if alternate sample rates or time allocations may be used, different settings allow separate optimization for those modes as implemented switch sizes remain fixed.
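The existence of a minimum-energy supply voltage at constant switch resistance can be illustrated numerically. The sketch below assumes a functional form that is not taken from the paper's equations: on-resistance proportional to 1/(W (V_DD − V_0)^α), with V_0 lumping the common mode and threshold voltages, and energy per operation proportional to W V_DD². The values of `v0` and `alpha` are hypothetical placeholders, not the fitted values used for CRAFT.

```python
import numpy as np


def energy_per_switch_op(vdd, v0, alpha):
    """Relative energy at constant on-resistance.

    Assumed model: width W ~ 1 / (vdd - v0)**alpha keeps R constant,
    and the gate energy scales as W * vdd**2.
    """
    return vdd ** 2 / (vdd - v0) ** alpha


def optimal_vdd(v0, alpha):
    """Closed-form minimum of the assumed model (valid for alpha < 2).

    Setting d/dV [2 ln V - alpha ln(V - v0)] = 0 gives V = 2 v0 / (2 - alpha).
    """
    return 2.0 * v0 / (2.0 - alpha)


v0 = 0.5      # hypothetical common mode plus threshold voltage, in volts
alpha = 0.8   # hypothetical empirical exponent

vdd_grid = np.linspace(v0 + 0.05, 2.0, 2000)
energy = energy_per_switch_op(vdd_grid, v0, alpha)
vdd_numeric = vdd_grid[np.argmin(energy)]

print(f"numeric optimum:  {vdd_numeric:.3f} V")
print(f"analytic optimum: {optimal_vdd(v0, alpha):.3f} V")
```

The curve has a single minimum because raising the supply shrinks the required switch width but increases the energy per unit of gate capacitance quadratically.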

5.2.2. Switch Size

For calculating the optimal switch width, note that each corresponds to a particular switch width (for a given resistance) on the constant-resistance plot. The maximum allowable nominal resistance can be calculated based on the required settling and allocated time chosen in Section 5.1. Therefore, from the chosen above and the maximum allowable resistance, the optimal switch width for each stage is calculated.

5.2.3. Time Allocation

The energy per stage is dependent on the required switch resistance and, consequently, the time allocated per stage. The per-stage time allocated is now considered as the last optimization variable and is redistributed (instead of the equal distribution assumed earlier) to optimize the total energy further. For the new allocated times (as shown in Figure 14), new optimal switch widths are determined.
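The switch sizing and time allocation steps can be sketched as follows. This is an illustration under simplifying assumptions, not the authors' optimizer: each stage is treated as a single RC with n time constants of settling, energy per stage is taken as proportional to 1/R (equal capacitance and supply in every stage), and the total processing time of 3.2 ns is a hypothetical placeholder. The per-stage settling values 7, 4, 5, and 4 are read here as numbers of time constants.

```python
import math


def max_switch_resistance(time_s, n_tau, cap_farads):
    """Largest R for which n_tau time constants fit in time_s (single-RC model)."""
    return time_s / (n_tau * cap_farads)


def allocate_time(total_time_s, n_tau_per_stage):
    """Split total time to minimize sum(n_i / t_i) subject to sum(t_i) = total.

    The Lagrange condition gives t_i proportional to sqrt(n_i).
    """
    roots = [math.sqrt(n) for n in n_tau_per_stage]
    return [total_time_s * r / sum(roots) for r in roots]


def relative_energy(times, n_tau_per_stage):
    """Energy proxy: sum of 1/R_i, with R_i ~ t_i / n_i."""
    return sum(n / t for n, t in zip(n_tau_per_stage, times))


N_TAU = [7, 4, 5, 4]    # per-stage settling, in time constants
TOTAL_TIME = 3.2e-9     # hypothetical total processing time, in seconds

equal = [TOTAL_TIME / len(N_TAU)] * len(N_TAU)
optimized = allocate_time(TOTAL_TIME, N_TAU)
saving = 1.0 - relative_energy(optimized, N_TAU) / relative_energy(equal, N_TAU)

print("optimized times (ps):", [round(t * 1e12) for t in optimized])
print(f"energy saving vs equal split: {saving * 100:.1f}%")
```

With equal per-stage weights the gain from redistribution is small; in practice the stages differ in capacitance, signal swing, and supply, which this sketch does not model and which would change the weights.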

6. LTI Model

For the purposes of system level simulations to test communication link performance, it is desirable to model the system using a simple but accurate linear model. For this purpose, we have devised an LTI approximation of the entire CRAFT-ADC digitization operation, along with the twiddle factor inaccuracies and noise. Noise disturbances as well as mismatch based nonidealities and digital correction are also included in the LTI model for performance evaluation. The simulated SNDR performance using the LTI model matches the performance obtained through circuit simulations. This makes the LTI model suitable for fast and reliable system level simulations without the need for circuit level modeling. It also allows the simulation of the CRAFT circuitry for a variety of other architectures and applications.

Note that the measurement results (discussed later in Section 7) include the nonidealities of the 8-bit resolution arbitrary waveform generator (AWG) inputs and the output test equipment (these are among the state-of-the-art test equipment available for measuring an RF front end signal processing DUT) that severely limit the observable nonidealities in the CRAFT circuitry. The additional nonidealities due to the test setup are not part of the model (or circuit simulation), so that the model predicts a somewhat better performance (roughly 10 dB) than is measured.

A brief description of the components shown in Figure 17 is tabulated in Table 1. Note that nonlinear effects such as settling error, charge injection and absorption, and clock feed-through have not been included in this model. As discussed earlier, the 16-point FFT ( ) is implemented as , where is the inherent scaling due to charge based operations.
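For reference, the ideal stage-wise structure of a radix-2, 16-point, decimation-in-time FFT (a bit-reversal permutation followed by four butterfly stages, as described in the Appendix) can be checked numerically. The sketch below builds the ideal matrices only; it does not include the charge-sharing scaling, attenuation, or mismatch of the CRAFT implementation, and the function names are illustrative.

```python
import numpy as np


def bit_reverse_permutation(n):
    """Permutation matrix P such that (P @ x)[i] = x[bit_reversed(i)]."""
    bits = n.bit_length() - 1
    idx = [int(format(i, f"0{bits}b")[::-1], 2) for i in range(n)]
    return np.eye(n)[idx]


def butterfly_stage(n, m):
    """Ideal DIT stage with butterflies of size m: I_(n/m) kron [[I, D], [I, -D]]."""
    half = m // 2
    twiddles = np.diag(np.exp(-2j * np.pi * np.arange(half) / m))
    eye = np.eye(half)
    block = np.block([[eye, twiddles], [eye, -twiddles]])
    return np.kron(np.eye(n // m), block)


def dit_fft_factors(n=16):
    """Return the permutation and the list of log2(n) ideal stage matrices."""
    n_stages = n.bit_length() - 1
    stages = [butterfly_stage(n, 2 ** s) for s in range(1, n_stages + 1)]
    return bit_reverse_permutation(n), stages


perm, stages = dit_fft_factors(16)

# Apply the permutation first, then stages 1 through 4.
product = perm.astype(complex)
for stage in stages:
    product = stage @ product

dft_matrix = np.fft.fft(np.eye(16))
print("factorization matches DFT:", np.allclose(product, dft_matrix))
```

Each stage matrix has only two nonzero entries per row, which is what makes a pairwise charge-sharing implementation possible.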

6.1. Systematic Twiddle Factor Error

The linear errors in the FFT matrix due to systematic capacitor mismatch can be represented using matrix instead of the ideal matrix , producing at the output, where

Expanding each nonideal stage ( ) as an ideal stage ( ) plus a systematic error ( ), where

6.2. Noise

Given that the DFT is a linear operation, the noise in the system can be analyzed using linear superposition of the individual noise sources: noise in the sampler, noise per processing stage, and ADC noise, as shown below.

6.2.1. Sampler Noise

Consider the following:

Note that the outputs of the CRAFT operation are interpreted in the frequency domain. Therefore, any inequality in the gains to the individual outputs will cause the output noise to be colored. This results in (10), where the sampler noise appears at the FFT output as expected white noise terms plus additional colored noise terms due to the unequal gains of to the different bin outputs. Consider the following:

6.2.2. Processing Noise

Each stage of processing adds noise in the charge domain and can be expanded as shown: where

6.2.3. ADC Noise

The ADC quantizers add noise ( ) to the outputs of the CRAFT operation and are interpreted as frequency domain noise perturbations, given by In case all the ADCs have equal gains and the white quantization noise approximation holds, can be approximated to be white.
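The coloring of input-referred white noise by unequal gains can be illustrated with a small numerical sketch. It assumes white sampler noise of unit variance and models the nonideal transform as the ideal DFT matrix plus a small random perturbation; the perturbation size of 0.01 is a hypothetical value, not a measured mismatch.

```python
import numpy as np

N = 16
F_IDEAL = np.fft.fft(np.eye(N))


def output_noise_cov(transform, sigma2=1.0):
    """Covariance of transform @ w for white noise w with variance sigma2."""
    return sigma2 * transform @ transform.conj().T


def max_offdiag(cov):
    """Largest off-diagonal magnitude; zero for perfectly white noise."""
    return np.max(np.abs(cov - np.diag(np.diag(cov))))


rng = np.random.default_rng(0)
error = 0.01 * (rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N)))
F_NONIDEAL = F_IDEAL + error  # hypothetical mismatch model

cov_ideal = output_noise_cov(F_IDEAL)
cov_nonideal = output_noise_cov(F_NONIDEAL)

print("ideal off-diagonal max:   ", max_offdiag(cov_ideal))
print("nonideal off-diagonal max:", max_offdiag(cov_nonideal))
```

For the ideal transform the covariance is N times the identity, so the noise stays white across bins; any gain error introduces unequal bin variances and correlation between bins.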

6.2.4. All Superposed Noise Sources

Assuming that the noise terms are independent, the total error can be expressed as

6.3. Digital Correction

Next we consider a linear correction step that can be implemented in the digital domain. The correction matrix is given by , where is the estimated set of independent input vectors giving uncorrected output responses . Note that and are matrices comprising vectors of size . The output, , after digital correction, is given by where

This assumes that the correction uses an accurate estimate of the implementation, as represented by the relationships and .

If the implementation has small error with regard to the ideal transform (so that ), the approximation below shows that the processing noise is ideally reduced so that the additive white noise model holds:

In summary, (15) shows that, with small error, digital correction correctly leads to an estimate, , of the desired ideal noiseless transform . Additionally, the colored sampling and processing noise terms are restored to their ideal output-referred values: and . Quantization noise, , is slightly modified after correction. Rather than being completely independent between output bins, the correction matrix causes some weighted combining of outputs. However, since the implementation is designed to match the desired ideal transform to many bits of accuracy, this effect is minimal.
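A linear digital correction of this kind can be sketched numerically. The example assumes, since the original expression is not reproduced here, that the correction matrix is formed as the ideal transform applied to a set of N independent calibration inputs, multiplied by the inverse of the corresponding uncorrected outputs. The calibration is noiseless in this sketch and the mismatch magnitude is hypothetical.

```python
import numpy as np

N = 16
rng = np.random.default_rng(1)

F_IDEAL = np.fft.fft(np.eye(N))
mismatch = 0.01 * (rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N)))
F_IMPL = F_IDEAL + mismatch  # hypothetical nonideal implementation


def estimate_correction(f_ideal, cal_inputs, cal_outputs):
    """Assumed form: C = F X Y^-1, with X the inputs and Y the raw outputs."""
    return f_ideal @ cal_inputs @ np.linalg.inv(cal_outputs)


# N independent calibration vectors (columns) and their uncorrected responses.
X_cal = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
Y_cal = F_IMPL @ X_cal
C = estimate_correction(F_IDEAL, X_cal, Y_cal)

# Apply the correction to a new input.
x = rng.standard_normal(N)
raw_error = np.max(np.abs(F_IMPL @ x - F_IDEAL @ x))
corrected_error = np.max(np.abs(C @ (F_IMPL @ x) - F_IDEAL @ x))

print(f"max error before correction: {raw_error:.2e}")
print(f"max error after correction:  {corrected_error:.2e}")
```

With noisy calibration measurements the estimate would no longer be exact, and averaging or a least-squares fit over more than N inputs would be the natural extension.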

6.4. Application of the Model

In order to incorporate the nonidealities arising from CRAFT as well as from the rest of the signal path, the appropriate error terms need to be computed and input to the LTI model. For the signal independent computation errors, such as those from uncompensated parasitics or mismatched capacitors, the matrices can be appropriately modified (i.e., ). This information is directly available from simulations, chip measurement, or foundry data and can be incorporated as the bit resolution of the matching and the resolution of the parasitic estimation.

In order to model signal dependent nonidealities, the error terms are computed as a function of the input amplitude. For example, for nonlinear incomplete settling errors, the error is proportional to the input amplitude (by the factor ), and the individual errors are made input signal dependent. Note that, in reality, each addition and multiplication operation is dependent on its specific inputs. In our model, we effectively average out this dependence linearly across all computations such that the resulting error is an approximate function of the input amplitude distribution. For example, Monte Carlo simulations for the effective settling in a 2-point share operation with uniformly distributed input amplitudes are shown in Figure 18. A histogram of the effective settling time is shown on the top, while a histogram of the resulting error with respect to the full-scale signal is shown at the bottom. In this case, the average settling time is while the average error is  dBFS. The nonlinear settling error can therefore be averaged and incorporated into the LTI system. Note that this average is a strong function of the input amplitude distribution and should be recalculated for the appropriate distribution.
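A Monte Carlo experiment of this kind can be sketched as below. This is a simplified stand-in for the circuit-level simulation, not a reproduction of it: the switch resistance is assumed inversely proportional to the gate overdrive evaluated at the average of the two shared voltages, the residual is the initial half-difference decayed exponentially, and every voltage and the nominal settling of 5 time constants are hypothetical values.

```python
import numpy as np


def simulate_share_settling(n_trials=100_000, n_tau_nominal=5.0, vdd=1.2,
                            vt=0.4, vcm=0.3, amplitude=0.2, seed=0):
    """Monte Carlo of a 2-point charge share with signal-dependent resistance.

    Returns the effective number of settling time constants per trial and
    the residual error relative to the peak-to-peak full scale.
    """
    rng = np.random.default_rng(seed)
    v1 = rng.uniform(vcm - amplitude, vcm + amplitude, n_trials)
    v2 = rng.uniform(vcm - amplitude, vcm + amplitude, n_trials)
    v_avg = 0.5 * (v1 + v2)

    # Assumed model: R ~ 1 / overdrive, so settling scales with overdrive.
    overdrive_nominal = vdd - vcm - vt
    n_tau_eff = n_tau_nominal * (vdd - v_avg - vt) / overdrive_nominal

    # Each node starts |v1 - v2| / 2 away from the shared final value.
    residual = 0.5 * np.abs(v1 - v2) * np.exp(-n_tau_eff)
    full_scale = 2.0 * amplitude
    return n_tau_eff, residual / full_scale


n_tau_eff, err = simulate_share_settling()
mean_settling = n_tau_eff.mean()
rms_error_dbfs = 20.0 * np.log10(np.sqrt(np.mean(err ** 2)))

print(f"mean effective settling: {mean_settling:.2f} time constants")
print(f"rms settling error: {rms_error_dbfs:.1f} dBFS")
```

Changing the input distribution (for example, from uniform to Gaussian) shifts both statistics, which is why the averaged error should be recomputed for the distribution of interest.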

7. Measurement Results

In this section, we present some additional previously unpublished measurements of the CRAFT engine to demonstrate its spectrum sensing capabilities. The test setup for measuring the CRAFT system is shown in Figure 19. As shown in the figure, and inputs from a Tektronix AWG-7122B arbitrary waveform generator are input to the CRAFT sampler. The latched outputs are externally buffered and digitized by external ADCs controlled by an FPGA (NI-7811R) programmed using LabVIEW.

The time-domain input and output characteristics, as observed by the oscilloscope (Agilent DSO7104B, see Figure 19), for a single frequency sinusoidal signal on the first bin are shown in Figure 20. A combination of sine and cosine signals is used to obtain an input signal. For this measurement, the system is set up such that the input signal is sampled using a progressively shifting phase ( ) upon every FFT conversion causing the bin 1 output to rotate periodically as shown in the figure. Note that this differential peak-to-peak measurement allows us to cancel the fixed DC offset and directly provides the on-bin output magnitude of the CRAFT operation.

For a signal with frequency exactly aligned with the first DFT bin, the output is expected only at the first bin output for a rectangular window. All other bin outputs are expected to be zero [36]. However, due to nonidealities in the CRAFT operation, outlined earlier in Section 3.3, the other bin outputs contain leaked outputs that rotate similarly over time.
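The ideal on-bin behavior, including the rotation of the bin 1 output as the sampling phase is stepped, can be reproduced with a few lines of NumPy. The sketch assumes a complex (I/Q) input tone and an ideal rectangular-window 16-point DFT; it models none of the CRAFT nonidealities.

```python
import numpy as np

N = 16


def ideal_fft_bins(bin_index, phase_rad):
    """Normalized 16-point DFT of a complex tone aligned with bin_index."""
    n = np.arange(N)
    tone = np.exp(1j * (2.0 * np.pi * bin_index * n / N + phase_rad))
    return np.fft.fft(tone) / N


# Step the sampling phase and watch the bin 1 output rotate.
phases = np.linspace(0.0, 2.0 * np.pi, 8, endpoint=False)
for phase in phases:
    bins = ideal_fft_bins(1, phase)
    leakage = np.max(np.abs(np.delete(bins, 1)))
    print(f"phase {phase:4.2f} rad: bin 1 = {bins[1]:.3f}, "
          f"max other bin = {leakage:.1e}")
```

The magnitude of bin 1 stays constant while its angle follows the input phase, and all other bins remain at numerical zero; any measured output on those bins is therefore attributable to noise or to implementation nonidealities.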

To measure the FFT performance across bins with high resolution, the outputs were digitized using off-chip ADCs and recorded by an FPGA (NI-7811R) programmed using LabVIEW. In order to reduce the output noise so as to observe the design nonlinearities, a large number of outputs were recorded and averaged. Offline calibration was used to cancel the static error due to parasitics as discussed in Section 6.3. For a single-tone input of varying amplitude at a frequency corresponding to bin 1, the outputs across all 16 bins are shown in Figure 21.

As shown in the figure, for low input amplitudes (thin, blue curves), bin 1 shows an output amplitude proportional to the input signal, as expected, while the other bins contain only randomly distributed noise. As the amplitude increases (thick, red curves), the bin 1 output increases as expected. However, the leakage onto the other bins now follows a particular nonrandom pattern that also rises with the input. This pattern signifies nonlinearity, as opposed to the noise seen in the other bins at lower input amplitudes. The leakage also rises faster (on a dB scale) than the bin 1 output amplitude. This is expected since the higher-order harmonics due to the sampler nonlinearity grow faster with input amplitude (as the nth power of the input amplitude for the nth harmonic) than the first harmonic.
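The faster growth of harmonic leakage can be illustrated with a toy model. The sketch assumes a memoryless cubic nonlinearity acting on a real tone at bin 1, which is a stand-in for the actual sampler nonlinearity; the coefficient `a3` and the amplitudes are hypothetical.

```python
import numpy as np

N = 16


def fundamental_and_third_db(amplitude, a3=0.05):
    """Bin 1 and bin 3 magnitudes (dB) for a tone through y = x + a3 * x**3."""
    n = np.arange(N)
    x = amplitude * np.cos(2.0 * np.pi * n / N)
    y = x + a3 * x ** 3
    spectrum = np.abs(np.fft.fft(y)) / N
    return 20.0 * np.log10(spectrum[[1, 3]])


low = fundamental_and_third_db(0.1)
high = fundamental_and_third_db(0.2)   # input raised by about 6 dB

step_db = 20.0 * np.log10(2.0)
slope_fundamental = (high[0] - low[0]) / step_db
slope_third = (high[1] - low[1]) / step_db

print(f"bin 1 slope: {slope_fundamental:.3f} dB/dB")
print(f"bin 3 slope: {slope_third:.3f} dB/dB")
```

The third-harmonic bin rises about 3 dB for each 1 dB of input, while the fundamental rises about 1 dB, which matches the qualitative trend described for the measured leakage.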

Table 2 tabulates the performance of CRAFT for 1-tone inputs in different bins for three representative speeds: 1 GS/s, 3 GS/s, and 5 GS/s. In the table, SFDR for a 1-tone test is calculated as the difference between a full-scale on-bin signal and the largest off-bin output from CRAFT nonlinearity/calibration errors. SNDR is calculated as follows: As shown in the table, an average SNDR of about 50 dB is obtained at 1 GS/s and 3 GS/s, while, at 5 GS/s, an SNDR of 47 dB is obtained. This achieves an 8-bit resolution spectrum detection across a 5 GHz bandwidth.
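The relation between the reported SNDR values and the quoted bit resolution can be checked with the standard effective-number-of-bits formula for a full-scale sine, ENOB = (SNDR − 1.76 dB)/6.02 dB. The short sketch below applies it to the 50 dB and 47 dB figures from the text; it is a convenience calculation, not the SNDR definition used for Table 2.

```python
def enob_from_sndr(sndr_db):
    """Effective number of bits for a full-scale sine input."""
    return (sndr_db - 1.76) / 6.02


enob_low_rate = enob_from_sndr(50.0)   # 1 GS/s and 3 GS/s average
enob_full_rate = enob_from_sndr(47.0)  # 5 GS/s

print(f"50 dB SNDR: {enob_low_rate:.2f} bits")
print(f"47 dB SNDR: {enob_full_rate:.2f} bits")
```

By this measure, 50 dB corresponds to about 8.0 bits and 47 dB to about 7.5 bits, so the 8-bit figure is met at the lower rates and approached at 5 GS/s.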

The four stages in the CRAFT processing engine have two power supplies: one for stages 1 and 2, , and one for stages 3 and 4, . This separation allows us to potentially optimize the supplies independently for power. Figure 22 shows the measured variation of energy consumed per conversion with the supply voltages. This trend is expected from the square dependence of energy on supply voltage: for dynamic digital power. Supply voltages of and were used as nominal supplies as marked by the black bold line in the figure. The energy consumption at this nominal supply voltage is only 12.2 pJ/conversion translating into a power consumption of 3.8 mW at 5 GS/s operation.
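The quoted power follows directly from the energy per conversion once the conversion rate is fixed. The sketch below assumes that one conversion corresponds to one 16-point FFT, so that the conversion rate is the 5 GS/s input rate divided by 16.

```python
def power_from_energy(energy_per_conversion_j, sample_rate_hz, fft_points=16):
    """Average power, assuming one conversion per fft_points input samples."""
    conversions_per_second = sample_rate_hz / fft_points
    return energy_per_conversion_j * conversions_per_second


power_w = power_from_energy(12.2e-12, 5e9)
print(f"power at 5 GS/s: {power_w * 1e3:.2f} mW")
```

The result, about 3.8 mW, agrees with the figure stated in the text.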

Figure 23 plots the measured output SNDR versus the varying supply voltages for the different stages. The nominal supply voltages are marked using black bold lines, and their intersection (nominal operating point) is labeled. As can be seen from the plot, for larger voltages, a high output SNDR is obtained. Higher supply voltages ensure that the switch “on” resistance is low allowing a higher settling accuracy. As expected, lowering the supply voltages reduces , in turn increasing switch “on” resistance and lowering settling accuracy.

The impact of processing switch supply voltage on SNDR is dependent on the signal swing at intermediate processing stages and the effects it has on the settling variation, as well as the differing severity certain computation errors have upon the final result. Also, as labeled in the figure, power can be optimized by lowering the voltage till the SNDR performance is at the edge of the waterfall. This corner corresponds to a power optimized supply for this design with a 37% reduction in energy consumption, while the SNDR is degraded by 3.8 dB compared to the nominal design point.

8. Conclusion

This paper discusses the use of passive switched capacitor circuits to design the RF front end for spectrum sensing in cognitive radios. Switched capacitor techniques suitable for wideband RF operation were presented. An example architecture based on a passive switched capacitor FFT front end was described. Design choices, methodology, and optimization were discussed, followed by system modeling for high level simulations. Measurement results were presented to demonstrate the efficacy of the design solution.

Appendix

CRAFT Matrices

The linear operation performed by the DFT is written as in (2). When the computation is performed in a stage-wise manner, as is done by an FFT (radix-2, 16-point), it can be decomposed into a sequence of operations as shown below: These four stages, through , are shown below and are represented graphically in Figure 12(a). is an identity matrix modified to perform bit-reverse ordering of the input vector, , for the decimation-in-time (DIT) algorithm. Consider the following:

The radix-2, 16-point, DIT FFT is implemented by CRAFT as a cascade of four stages of in-place processing operations: where was shown previously and , , , and are shown below. They differ from the FFT matrices due to the attenuation, charge-averaging, and stage scaling factor effects of the implementation. They are rewritten below in a manner that matches the implementation: where

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work was partially supported by DARPA and Center for Circuit and System Solutions (C2S2). The authors are grateful to the members of the UMN analog design lab for discussions and tape-out help.