Abstract

High-speed links which employ source synchronous clocking architectures have the ability to track correlated jitter between clock and data channels up to high frequencies. However, system timing margins are degraded by channel skew between clock and data signals and high-frequency loss. This paper describes how these key channel effects impact the jitter performance and influence the clocking architecture of high-speed source synchronous links. Tradeoffs in complexity and jitter tracking performance of common per-channel de-skew circuits are discussed, along with how band-pass filtering can be leveraged to provide additional jitter filtering at the receiver. Jitter tolerance analysis for a 10 Gb/s system shows that a near all-pass delay-locked loop (DLL) and phase-interpolator- (PI-) based de-skew performs best under low skew conditions, while, at high skew, architectures which leverage band-pass clock filtering or a phase-locked loop (PLL) for increased jitter filtering are more suitable. De-skew based on injection-locked oscillators (ILOs) offers a reduced complexity design and competitive jitter tolerance over a wide skew range.

1. Introduction

Interface architectures which allow for high data rates at improved power efficiency levels are required to satisfy the growing I/O bandwidth in power-constrained environments ranging from data centers [1] to mobile systems [2]. Links which leverage source synchronous clocking, such as high-bandwidth multichannel parallel connections from processor to processor or memory chips (Figure 1) [3, 4], have potential to achieve these objectives due to their wide bandwidth jitter tracking and reduced clock circuitry complexity relative to embedded clock systems [5].

One of the major factors limiting the maximum achievable I/O data rates is the degradation of system timing margins by clock jitter. In a multichannel source synchronous system, jitter can be decomposed into sources which are correlated or common among the clock and data links, such as phase-locked loop (PLL) and supply-noise jitter, and uncorrelated sources, such as driver random and intersymbol-interference-(ISI-) induced jitter. A key advantage of source synchronous systems is that correlated jitter can be tracked over an extremely wide bandwidth, as the clock which synchronizes the transfer of data onto the channels is also forwarded to the receiver to perform the data sampling operation. Thus, only uncorrelated jitter and jitter at frequencies beyond this high tracking bandwidth degrade the data capture process.

Delay variations between the clock and data signal paths, which occur due to circuit board trace mismatches, forwarded clock buffer/regeneration delays, and multichannel clock distribution, have a major impact on link performance. While matched delay elements in the data path have been implemented to mitigate this clock/data skew in lower-speed memory interfaces [6], implementing this at very high speed conflicts with the paramount objective of improving I/O power efficiency. Ultimately, clock-to-data skew can approach the ns-range and places a limit on the maximum jitter frequency that the receiver should track for optimal timing margins. Jitter amplification of the forwarded clock over the low-pass channel is another important effect which influences the link architecture.

This paper presents an analysis of key channel effects and how different receiver clock de-skew structures impact the jitter performance of high-speed source synchronous links. Section 2 gives an overview of source synchronous link architectures and explains the effects of clock and data skew and channel loss on system jitter. The operation and jitter tracking properties of different receiver de-skew circuits are discussed in Section 3. Section 4 details how applying band-pass filtering to the forwarded clock impacts clock jitter performance. A comparison of clock de-skew architectures’ jitter tracking performance for different channel conditions is made in Section 5. Finally, Section 6 concludes the paper.

2.1. Source Synchronous Architecture

Source synchronous link architectures (Figure 1) employ an extra channel to transmit the clock signal from transmitter to receiver chip for data sampling. In order to maximize the jitter correlation between clock and data paths, a replica transmitter with the same power-supply jitter sensitivity as the data transmitter drives a clock pattern onto the clock channel. Since the low-pass channel will attenuate the clock signal, a receiver clock amplifier is often used to compensate the channel filtering and drive the clock signal over the on-chip distribution network. Clock de-skew circuits adjust the sampling clock phase at each receiver channel independently to maximize link timing margins. In sampled data receivers, the de-skew circuits introduce a 0.5 UI spacing to align the forwarded-clock signal near the center of the data pulses. For integrating receiver front ends, in contrast, the forwarded clock signal is aligned near the ends of the data pulses [7].

2.2. Clock Skew Impact on Jitter

In source synchronous systems, emphasis is placed on matching the data and clock path circuit design in order to ensure similar supply noise sensitivity and maximize jitter correlation. As the clock which synchronizes the data transfer onto the channels is also forwarded to the receiver to perform the data sampling operation, this ideally results in zero differential jitter during sampling for correlated jitter with frequency content up to the clock de-skew circuits’ jitter tracking bandwidth (JTB).

However, delay mismatch in the correlated clock and data jitter due to skew degrades system timing margins. This is important because it is difficult to match the trace lengths of the clock and all data signals in practical systems, resulting in different signal propagation delays. Moreover, circuit mismatches in the clock and data paths introduce additional skew.

Consider a data jitter sequence given by $J_D(t) = A\sin(2\pi f_j t)$, where $A$ is the jitter amplitude and $f_j$ is the jitter frequency. Neglecting jitter filtering from receiver de-skew circuitry and channel jitter amplification effects, clock jitter at the sampler is expressed as $J_C(t) = A\sin(2\pi f_j (t - N T_b))$, where $T_b$ is the bit period and $N$ is the skew in bit periods. Differential jitter at the receiver samplers is given by $J_{diff}(t) = J_D(t) - J_C(t)$. If no skew exists between the clock and data signals, $N = 0$ and $J_{diff}(t) = 0$, which implies that the system provides ideal data and clock jitter tracking. However, skew-induced phase shift between the jitter terms results in nonzero differential jitter, as shown in the example of Figure 2 for a 10 Gb/s system with 5 UI (500 ps) skew and a 100 MHz correlated jitter component. Here a peak differential jitter of approximately 0.16 UI results from 0.5 UI jitter amplitude. Figure 3 shows that as jitter frequency increases from 100 to 333 MHz, resulting in a larger phase shift for the 500 ps skew value, differential jitter also increases. This increased differential jitter will ultimately degrade system bit-error rate.
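The example values in this paragraph can be checked with a brute-force time-domain sweep. The sketch below assumes the sinusoidal model described here, with the clock jitter taken as a copy of the data jitter delayed by the skew; the function name and sample count are illustrative choices, not part of the original analysis.

```python
import math

def peak_differential_jitter(amplitude_ui, jitter_freq_hz, skew_s, samples=10000):
    """Brute-force peak of J_D(t) - J_C(t) over one jitter period, in UI."""
    period = 1.0 / jitter_freq_hz
    peak = 0.0
    for k in range(samples):
        t = k * period / samples
        # Data jitter and the (assumed) skew-delayed clock jitter
        j_data = amplitude_ui * math.sin(2 * math.pi * jitter_freq_hz * t)
        j_clock = amplitude_ui * math.sin(2 * math.pi * jitter_freq_hz * (t - skew_s))
        peak = max(peak, abs(j_data - j_clock))
    return peak

# 10 Gb/s link, 5 UI (500 ps) skew, 0.5 UI correlated jitter amplitude
for f_j in (100e6, 333e6):
    dj = peak_differential_jitter(0.5, f_j, 500e-12)
    print(f"{f_j / 1e6:.0f} MHz: peak differential jitter = {dj:.2f} UI")
```

Under these assumptions the sweep gives about 0.16 UI at 100 MHz and about 0.5 UI at 333 MHz, in line with the values read from Figures 2 and 3.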

Figures 4(a) and 4(b) illustrate the relationship between skew, jitter frequency, and normalized differential jitter, $|J_{diff}(f)/J_D(f)|$, using the frequency domain transformation of a system with a skew of $\tau$: $|J_{diff}(f)/J_D(f)| = |1 - e^{-j2\pi f\tau}| = 2|\sin(\pi f\tau)|$. Note that this expression neglects jitter filtering from receiver de-skew circuitry, which will be introduced later in Section 5. Figure 4(a) shows that as jitter frequency increases from low to moderate frequencies, a larger phase shift develops and differential jitter increases. A steeper increase in differential jitter is observed at lower frequencies as the skew is increased. While there may be a small set of dominant jitter frequencies in a multichannel system, the performance impact will differ due to variations in the per-channel clock-to-data skew. Figure 4(b) shows that, for a given jitter frequency, the normalized differential jitter increases with clock-to-data skew. The minimum frequency for which the differential jitter amplitude is equal to the input jitter is inversely proportional to the clock-to-data skew: $f = 1/(6\tau)$. This is also shown in Figure 3, where for a skew of 500 ps, the 333 MHz differential jitter amplitude is equal to the 0.5 UI input jitter amplitude. The differential jitter is amplified for moderate jitter frequencies above this value.

The differential jitter is periodic with a frequency of $1/\tau$, where $\tau$ is the clock-to-data skew, peaking at a value of twice the input jitter amplitude at a frequency of $1/(2\tau)$ when the jitter is 180° out of phase and reducing for higher frequencies. However, there is generally little correlation between jitter components at these higher frequencies. This increased differential jitter with frequency implies that an all-pass jitter tracking response is not optimal if clock skew exists in the system and that implementing receiver de-skew circuits with jitter filtering could provide performance benefits.
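The frequency-domain behavior described above can be tabulated for a few skew values. This is a minimal sketch assuming the unfiltered relation in which the normalized differential jitter is $2|\sin(\pi f\tau)|$, the magnitude of the difference between a sinusoid and a copy delayed by $\tau$; the helper names are hypothetical.

```python
import math

def normalized_diff_jitter(jitter_freq_hz, skew_s):
    """2*|sin(pi*f*skew)|: differential jitter relative to input jitter,
    with no receiver jitter filtering."""
    return 2.0 * abs(math.sin(math.pi * jitter_freq_hz * skew_s))

def unity_gain_frequency(skew_s):
    """Lowest frequency at which differential jitter equals input jitter."""
    return 1.0 / (6.0 * skew_s)

def peak_frequency(skew_s):
    """Frequency at which clock and data jitter are 180 degrees out of phase."""
    return 1.0 / (2.0 * skew_s)

for skew in (200e-12, 500e-12, 1e-9):
    f_unity = unity_gain_frequency(skew)
    f_peak = peak_frequency(skew)
    peak_db = 20.0 * math.log10(normalized_diff_jitter(f_peak, skew))
    print(f"skew {skew * 1e12:.0f} ps: unity at {f_unity / 1e6:.0f} MHz, "
          f"peak {peak_db:.1f} dB at {f_peak / 1e6:.0f} MHz")
```

Larger skew moves both the unity-gain frequency and the 6 dB peak to lower jitter frequencies, which is the trend shown in Figure 4.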

2.3. Channel Clock Jitter Amplification

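High-frequency channel loss (Figure 5) also impacts the jitter performance of source synchronous links, as the low-pass channel response causes input jitter to be amplified in a high-pass manner [8, 9]. In order to investigate channel clock jitter amplification, consider the following time domain expression for a clock signal at frequency $\omega_c$ that contains a sinusoidal jitter component with amplitude $A_j$ and frequency $\omega_j$: $v_{TX}(t) = \cos(\omega_c t + A_j\sin(\omega_j t))$, which, for small $A_j$ values, can be expressed in the frequency domain by $V_{TX}(\omega) \approx \delta(\omega - \omega_c) + \frac{A_j}{2}[\delta(\omega - \omega_+) - \delta(\omega - \omega_-)]$, where $\omega_\pm = \omega_c \pm \omega_j$ [8]. The main clock component and the jitter sidebands experience different channel scaling factors, resulting in the following received signal $V_{RX}(\omega) \approx H_c\,\delta(\omega - \omega_c) + \frac{A_j}{2}[H_+\,\delta(\omega - \omega_+) - H_-\,\delta(\omega - \omega_-)]$, where $H_c$, $H_+$, and $H_-$ are the channel response at $\omega_c$, $\omega_+$, and $\omega_-$, respectively. Transforming this filtered signal back to the time domain results in $v_{RX}(t) \approx |H_c|\cos\left(\omega_c t + A_j\frac{|H_+| + |H_-|}{2|H_c|}\sin(\omega_j t)\right)$. Thus, the ratio of received jitter to transmitted jitter can be approximated as $\frac{|H_+| + |H_-|}{2|H_c|}$, which implies the potential for received jitter amplification for typical low-pass channels.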

To validate this for the four backplane channels of Figure 5(a), the jitter impulse response [5, 10–12] is generated by extracting the channel output jitter pattern from a 5 GHz clock input with a 1 ps impulse applied to a rising edge. The channels’ jitter transfer functions are obtained by performing a DFT on this output jitter pattern and are shown in Figure 5(b) plotted up to the clock frequency to capture duty cycle distortion (DCD) jitter. As predicted by (10), the jitter amplification factor is highest in channels with the most severe frequency-dependent loss. This further motivates the use of receiver de-skew circuits that provide jitter filtering which, in addition to filtering uncorrelated high-frequency jitter, also mitigate the impact of this clock jitter amplification.

System designers will often choose to forward a lower frequency clock in an attempt to reduce jitter amplification. However, the clock jitter amplification is not so much determined by the absolute loss at the clock frequency, but rather the slope of the loss near the clock frequency. Note that while the 5 GHz loss is similar in channels 2 and 4, the jitter amplification factor is much higher in channel 4 with the 7 GHz resonant null versus the smooth loss channel 2. As shown in Figure 5(c), forwarding a 2.5 GHz clock over channel 4 provides much less jitter amplification due to the relatively shallow loss slope near 2.5 GHz versus the steep loss slope at 5 GHz due to the resonant null. However, for channel 2 which has a relatively uniform loss slope, the jitter amplification is similar for both 2.5 GHz and 5 GHz forwarded clocks. Thus, channel loss characteristics should be carefully considered in the decision to send a lower clock frequency, as this does not always mean a reduction in jitter amplification.
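The dependence on loss slope rather than absolute loss can be illustrated with two toy channels. The sketch below assumes the sideband-ratio form of the jitter amplification estimate, (|H(fc - fj)| + |H(fc + fj)|) / (2|H(fc)|), and uses two hypothetical magnitude responses: one with loss in dB growing linearly with frequency, and one with an added resonant null near 7 GHz. These models and their parameters are illustrative only and are not the measured backplane channels of Figure 5.

```python
import math

def smooth_channel(freq_hz, loss_db_at_5g=20.0):
    """Hypothetical channel whose loss in dB grows linearly with frequency."""
    loss_db = loss_db_at_5g * freq_hz / 5e9
    return 10.0 ** (-loss_db / 20.0)

def notch_channel(freq_hz, notch_hz=7e9, zero_damp=0.2, pole_damp=1.0):
    """Same smooth loss plus a hypothetical resonant null near notch_hz."""
    u = freq_hz / notch_hz
    num = math.hypot(1.0 - u * u, zero_damp * u)
    den = math.hypot(1.0 - u * u, pole_damp * u)
    return smooth_channel(freq_hz) * num / den

def jitter_amplification(channel, clock_hz, jitter_hz):
    """Sideband-to-carrier ratio: (|H(fc-fj)| + |H(fc+fj)|) / (2*|H(fc)|)."""
    lower = channel(clock_hz - jitter_hz)
    upper = channel(clock_hz + jitter_hz)
    return (lower + upper) / (2.0 * channel(clock_hz))

for name, channel in (("smooth", smooth_channel), ("notch", notch_channel)):
    for clock_hz in (2.5e9, 5e9):
        gain = jitter_amplification(channel, clock_hz, 1e9)
        print(f"{name:6s} {clock_hz / 1e9:.1f} GHz clock: "
              f"{20 * math.log10(gain):.2f} dB")
```

In this toy model the smooth channel gives the same amplification for both clock frequencies, while the channel with the null amplifies jitter more at 5 GHz, where the loss slope is steeper, than at 2.5 GHz.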

3. Jitter Tracking in Different Receiver Architectures

As alluded to in the previous section, the impact of channel skew and loss on system jitter performance has an influence on the desired receiver jitter filtering properties. This section discusses the operation and jitter filtering or tracking properties of common receiver de-skew circuits.

3.1. DLL-Phase Interpolator De-Skew

Figure 6 shows a receiver de-skew architecture which utilizes a delay-locked loop (DLL) followed by a phase interpolator (PI). The DLL feedback system generates uniformly spaced clock phases by passing the input clock through a multicell delay line whose total delay is typically set to one or one-half of the input clock cycle. High-resolution mixing is then performed by the PI with a pair of these coarsely spaced clock phases in order to generate the optimal sampling clock position.

As the clock passes directly through the DLL delay line and is simply phase shifted by the PI, ideally this DLL-PI de-skew system displays an all-pass jitter transfer function. However, the delay induced by the delay line in the DLL feedback system introduces frequency peaking.

In order to investigate this peaking behavior, consider the DLL -domain model shown in Figure 7 [13]. The DLL jitter transfer is where is the charge pump current, is the loop filter capacitor, is the sampling period, and is the delay line gain.

The frequency peaking observed in the 5 GHz DLL jitter transfer function of Figure 8 results in undesired amplification of high-frequency input jitter and degradation of system timing margins. Introducing an additional high-frequency pole, , within the DLL can reduce this high-frequency jitter amplification. A common example of this additional pole is powering the delay line with a linear regulator which introduces extra filtering in the loop [14]. With this additional pole, the overall DLL loop filter response is modified to and the DLL jitter transfer becomes As observed in Figure 8, introducing an additional 250 MHz pole reduces the high-frequency jitter amplification. In order to compensate for the residual frequency peaking, it is possible to cascade an injection-locked oscillator after the DLL [13] or leverage band-pass filtering of the clock signal [15] prior to the DLL for additional filtering.

The power supply noise performance of the de-skew circuitry is also an important design consideration in these multichannel source synchronous links, as switching noise from multiple transmitters, receivers, and core logic can couple into the receiver clock de-skew circuitry. As a DLL exhibits a high-pass response to noise coupled into the power supply of the delay line [16], the DLL high-pass bandwidth, which is set by the pole in (11), should be increased to minimize the impact of power supply noise. However, a tradeoff exists in the DLL-PI architecture between power supply noise filtering and peaking in the jitter transfer function, as increasing the DLL pole frequency results in increased peaking. Thus, the DLL pole frequency location should be set to balance these two system design considerations.

3.2. PLL-Phase Interpolator De-Skew

Figure 9 shows a receiver de-skew architecture which utilizes a phase-locked loop (PLL) followed by a PI. Similar to the DLL-PI de-skew, the PLL generates uniformly spaced clock phases with a voltage-controlled oscillator (VCO) which is phase-locked to the input clock. In addition, the PLL can provide frequency multiplication of a lower-frequency forwarded clock.

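The overall jitter transfer function is the PLL phase transfer function, as ideally the PI only passes the PLL output signal jitter through to the sampling clock. Utilizing a common series $R$-$C$ filter and neglecting any secondary parallel filtering cap, $C_2$, the PLL jitter transfer function is $H_{PLL}(s) = \frac{\frac{I_{CP}K_{VCO}}{2\pi}\left(Rs + \frac{1}{C}\right)}{s^2 + \frac{I_{CP}K_{VCO}R}{2\pi}s + \frac{I_{CP}K_{VCO}}{2\pi C}}$, where $I_{CP}$ is the charge pump current and $K_{VCO}$ is the VCO gain (in rad/s/V). This expression can be rewritten as $H_{PLL}(s) = \frac{2\zeta\omega_n s + \omega_n^2}{s^2 + 2\zeta\omega_n s + \omega_n^2}$, where $\omega_n = \sqrt{\frac{I_{CP}K_{VCO}}{2\pi C}}$ and $\zeta = \frac{\omega_n R C}{2}$.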

Setting the damping factor, , too low in the second-order PLL jitter transfer function will result in peaking that amplifies jitter on the forwarded clock. As shown in Figure 10, this peaking exceeds 1 dB for damping factors less than 1.2. While this peaking can be reduced by increasing the damping factor further, there is the potential for instability and additional frequency peaking if the damping factor is increased excessively due to the secondary pole introduced by the extra filter capacitor. A PLL damping factor of is assumed for the remainder of this paper.
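The relationship between damping factor and peaking can be estimated numerically. The following sketch assumes the standard second-order form H(s) = (2ζω_n s + ω_n²) / (s² + 2ζω_n s + ω_n²) for a charge-pump PLL with a single series R-C loop filter and no secondary capacitor; the function names and sweep range are illustrative.

```python
import math

def pll_jitter_transfer_mag(omega_ratio, zeta):
    """|H(jw)| for H(s) = (2*zeta*wn*s + wn^2) / (s^2 + 2*zeta*wn*s + wn^2),
    evaluated at omega_ratio = w / wn."""
    num = complex(1.0, 2.0 * zeta * omega_ratio)
    den = complex(1.0 - omega_ratio ** 2, 2.0 * zeta * omega_ratio)
    return abs(num / den)

def peaking_db(zeta, points=4000):
    """Largest jitter transfer gain in dB over a log sweep of w/wn from 0.01 to 100."""
    peak = 0.0
    for k in range(points):
        ratio = 10.0 ** (-2.0 + 4.0 * k / (points - 1))
        peak = max(peak, pll_jitter_transfer_mag(ratio, zeta))
    return 20.0 * math.log10(peak)

for zeta in (0.5, 0.707, 1.0, 1.2, 2.0):
    print(f"zeta = {zeta:5.3f}: peaking = {peaking_db(zeta):.2f} dB")
```

Under this idealized model the peaking is a little over 1 dB at a damping factor of 1 and drops just below 1 dB at 1.2, which is consistent with the trend described for Figure 10. The secondary pole from an extra filter capacitor is not modeled here.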

If increased jitter filtering is desired due to channel skew or loss effects, PLL bandwidth can be lowered by reducing charge pump current or increasing filter capacitance. However, excessive reduction in loop bandwidth increases both PLL settling time, which is a problem for low-power systems that require fast wakeup from power-down modes [17], and VCO accumulated jitter, which will degrade timing margins:

At the PLL output, VCO phase noise exhibits a high-pass transfer function. The accumulated VCO jitter is a random jitter (RJ) component that has to be considered in the link timing budget. In the time domain, VCO random jitter will accumulate up to a time inversely proportional to the PLL bandwidth. The variance of the VCO random jitter is calculated from the VCO phase noise profile and the PLL transfer function by [18] where

Figure 11 plots calculated values for the VCO phase noise profile of Figure 14 and verifies that VCO accumulated jitter reduces as PLL loop bandwidth is increased. Thus, in setting PLL loop bandwidth, system designers must balance the tradeoff between filtering input forwarded clock jitter and VCO accumulated jitter.

PLLs are also susceptible to power supply noise, especially the noise coupled through the VCO supply. As a PLL exhibits a band-pass response to noise coupled into the VCO power supply [18], the PLL bandwidth should be reduced to minimize the impact of power supply noise. However, this will come at the expense of a reduced jitter tracking bandwidth and increased VCO random jitter accumulation.

3.3. ILO De-Skew

Relative to DLL or PLL-PI architectures, a simpler approach involves utilizing an injection-locked oscillator (ILO) to obtain the required per-channel de-skew, as shown in Figure 12. Under injection lock, the oscillator runs at the same frequency of the injected clock signal, but with an output phase shift of that is a function of the relative injection clock signal strength, , and the difference between the oscillator’s free-running frequency, , and the injected clock frequency, . Derived in [19, 20], the output phase shift can be expressed as where varies according to the oscillator topology,

LC oscillator: ring oscillator: and is the LC oscillator tank quality factor and is the number of delay stages in the ring oscillator. Theoretically, ILOs are only capable of achieving a phase de-skew range of ±90°, which is the minimum required phase shift for two clock phases in a half-rate receiver architecture. However, the injection locking is weak at this extreme phase shift. In order to make the system more robust and provide additional phase shift, additional weighted phase inversions of the injected signal can be employed [21].

An ILO provides first-order low-pass jitter filtering on the incoming clock signal where is the ILO jitter tracking bandwidth. Tuning the output phase shift by adjusting the oscillator free-running frequency and injection strength will also impact the ILO jitter tracking bandwidth:

Assuming a 4-stage ring oscillator with a 5 GHz free-running frequency and a 6.5 MHz minimum frequency step, the jitter tracking bandwidth and phase shift are plotted in Figure 13 for various injection strengths. A maximum jitter tracking bandwidth is obtained with zero phase shift, with a bandwidth degradation of less than 10% for de-skew settings within ±36°. However, the bandwidth falls off sharply as de-skew settings approach the ±90° theoretical maximum phase shift.

ILO jitter tracking implies that the output phase noise can actually be superior to the inherent oscillator phase noise, provided that the injection signal has lower phase noise, as is often the case where an LC-PLL is used at the transmitter chip to generate the forwarded clock and ring oscillators at the receiver serve as per-channel ILOs. If is the injected clock phase noise and is the de-skew oscillator phase noise, then the output phase noise, , at a given frequency, , can be expressed in terms of frequency offset or de-skewed output phase shift [21]: Using this expression, output phase noise is plotted for several de-skew settings in Figure 14. As the free-running frequency offset is tuned higher to generate a larger phase shift, the output phase noise deviates from the injection phase noise and begins to track the free-running oscillator phase noise at lower frequencies.

ILO accumulated jitter, obtained by integrating the ILO output phase noise of Figure 14 with (19), is shown as the de-skew phase is varied in Figure 15. Higher output jitter is observed as the de-skew phase is increased due to more of the free-running oscillator phase noise spectrum being integrated.

An increase in injection strength allows for a higher jitter tracking bandwidth and allows the output phase noise spectrum to track the injection phase noise over a larger frequency range, resulting in a reduced amount of accumulated jitter for a given de-skew setting. Note that near the edge of the de-skew range the accumulated jitter rises sharply, which would dramatically degrade receiver timing margins. This motivates the use of quarter-rate receiver architectures [22] which only require phase de-skew of four ILO clock phases over a range of ±45° to cover the necessary ±0.5 UI tuning. If the phase de-skew is limited to a maximum ±45° de-skew range, the ILO jitter tracking bandwidth is not degraded significantly and the oscillator accumulated jitter is significantly reduced.

ILOs are also sensitive to any noise coupled from their power supply, with the severity dependent on the specific oscillator topology [23]. In setting ILO design parameters such as the LC oscillator tank quality factor and the number of ring oscillator delay stages, designers should balance these parameters’ impact on supply noise sensitivity and jitter tracking bandwidth.

The use of band-pass filters can also provide jitter filtering, as an alternative to the de-skew circuits in the previous section. In a forwarded clock system, band-pass filtering can be leveraged to provide jitter filtering in a DLL de-skew system or to decouple jitter filtering from the VCO jitter accumulation present in a PLL or ILO system. Band-pass filtering has been implemented in the receiver input clock amplifier by replacing the common differential resistive load with an LC tank designed to center the filter at the forwarded clock frequency [15]. Inductive termination has also been used to resonate at the clock frequency with the capacitance of a multichannel distribution network [24], resulting in a band-pass response that both provides jitter filtering and reduces clock distribution power by increasing the effective distribution impedance.

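In order to investigate the jitter filtering offered by band-pass filters, we consider the expression stated earlier in (10). For a band-pass filter properly centered at the input clock frequency (Figure 16), $|H(\omega_c)| > |H(\omega_c \pm \omega_j)|$, where $\omega_c$ is the clock frequency and $\omega_j$ is the jitter frequency. Thus, the ratio of received to transmitted jitter in (10) is less than one and the jitter of the transmitted clock has been reduced by band-pass filtering.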

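For up to moderate jitter frequency offsets, the band-pass function can be approximated as a low-pass function with respect to frequencies offset from $\omega_c$: $H(j(\omega_c + \Delta\omega)) \approx \frac{1}{1 + j\frac{2\Delta\omega}{\omega_{BW}}}$, where $\Delta\omega = \omega - \omega_c$ and $\omega_{BW} = \omega_c/Q$ is the bandwidth of a band-pass filter with quality factor, $Q$. The band-pass filter’s jitter transfer function is approximated by $\frac{1}{1 + \frac{2Qs}{\omega_c}}$.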

Figure 17 shows that jitter filtering increases with $Q$ value. At a 5 GHz center frequency, a $Q$ of 3 yields a jitter tracking bandwidth near 800 MHz. This value can be realized with a passive-inductor-based band-pass filter [15]. Large-signal simulations with an active-inductor band-pass filter [25] show that a $Q$ of 30 is possible, which would yield a jitter tracking bandwidth near 80 MHz. Band-pass filters which allow for $Q$ tuning [25] provide the potential for an adjustable jitter tracking bandwidth that can be set independent of the de-skew position and also avoid the jitter accumulation present in a PLL or ILO system.
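The bandwidth figures above follow from a jitter tracking bandwidth of fc / (2Q). As a rough check, the sketch below compares that first-order low-pass approximation against the average sideband gain of a textbook second-order band-pass response, H = 1 / (1 + jQ(f/fc - fc/f)). This filter model and the function names are assumptions made for illustration, not the specific circuits of [15] or [25].

```python
import math

def bpf_jitter_tracking_bandwidth(center_hz, q):
    """First-order low-pass equivalent bandwidth of a band-pass filter: fc / (2*Q)."""
    return center_hz / (2.0 * q)

def bpf_magnitude(freq_hz, center_hz, q):
    """|H| of a second-order band-pass response, normalized to 1 at fc."""
    detune = freq_hz / center_hz - center_hz / freq_hz
    return 1.0 / math.hypot(1.0, q * detune)

def jitter_transfer_exact(jitter_hz, center_hz, q):
    """Average sideband gain relative to the gain at the clock frequency."""
    lower = bpf_magnitude(center_hz - jitter_hz, center_hz, q)
    upper = bpf_magnitude(center_hz + jitter_hz, center_hz, q)
    return (lower + upper) / (2.0 * bpf_magnitude(center_hz, center_hz, q))

def jitter_transfer_approx(jitter_hz, center_hz, q):
    """First-order low-pass approximation with bandwidth fc / (2*Q)."""
    bw = bpf_jitter_tracking_bandwidth(center_hz, q)
    return 1.0 / math.hypot(1.0, jitter_hz / bw)

for q in (3, 30):
    bw = bpf_jitter_tracking_bandwidth(5e9, q)
    exact = jitter_transfer_exact(200e6, 5e9, q)
    approx = jitter_transfer_approx(200e6, 5e9, q)
    print(f"Q = {q:2d}: JTB = {bw / 1e6:.0f} MHz, "
          f"gain at 200 MHz exact {exact:.3f} vs approx {approx:.3f}")
```

For a 5 GHz clock this gives jitter tracking bandwidths of about 833 MHz and 83 MHz for Q values of 3 and 30, and the two gain estimates agree closely at a 200 MHz jitter offset.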

5. Comparison of Source Synchronous Clocking Architectures

The previous sections discussed the jitter transfer characteristics of the channel and of different receiver blocks that a forwarded clock could encounter. This section examines how jitter tracking bandwidth impacts system differential jitter and compares the jitter tolerance performance of different source synchronous clock architectures for various channel skew conditions.

To understand how jitter tracking bandwidth impacts receiver performance, the system model of Figure 18 is used. Including the receiver circuitry jitter transfer function , the differential jitter seen at the sampler is

In the case of DLL-PI de-skew, if the peaking due to the delay line is neglected, the jitter transfer function can be approximated as all-pass, that is, unity at all jitter frequencies, and would not alter the differential jitter function. For systems which use PLL-PI or ILO de-skew or include a band-pass filter in the clock path, in contrast, the forwarded clock jitter is attenuated by a low-pass function. For these systems, a first-order low-pass function serves as a good approximation for the jitter transfer function.

In order to illustrate how varying the jitter tracking bandwidth affects differential jitter, consider the results for jitter at a common power-supply resonance frequency of 200 MHz [26], shown in Figure 19. Low skew values result in small relative phase shifts for the jitter on the data and clock signals, allowing for minimal filtering of the 200 MHz jitter on the receiver clock and an optimum jitter tracking bandwidth near or above 1 GHz. Phase shift between this correlated jitter increases with skew, resulting in a pronounced optimal jitter tracking bandwidth for skew values above 500 ps, which is as low as 60 MHz for a skew of 1 ns. If the jitter tracking bandwidth is increased beyond this optimal point, the differential jitter increases and can even be amplified as the correlated clock and data jitter combine out of phase. This implies that, along with general system constraints (i.e., power/area consumption and wake-up time), clock-to-data skew should be considered in selecting the receiver jitter tracking bandwidth.

Optimal jitter tracking bandwidth depends on the location of the dominant jitter frequency terms, as shown by plotting the normalized differential jitter over a wide frequency range for a skew of 600 ps in Figure 20. As predicted by Figure 19, the 213 MHz optimal jitter tracking bandwidth displays the lowest normalized differential jitter for a 200 MHz jitter frequency term. If jitter is dominant at higher frequencies, the system would benefit from the use of a lower jitter tracking bandwidth to filter the jitter terms combining out of phase. However, this lower jitter tracking bandwidth will result in higher differential jitter at lower frequencies, as the 100 MHz performance is significantly worse with a 30 MHz jitter tracking bandwidth. Note that the absence of jitter filtering, or a jitter transfer function of unity, results in significantly worse differential jitter at higher frequencies, peaking at 6 dB at a frequency of $1/(2\tau)$, where $\tau$ is the clock-to-data skew and the correlated jitter terms combine with a 180° phase shift.
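The bandwidths quoted in this section can be approximated with a simple rule of thumb. The sketch below assumes, as a heuristic not stated in the text, that a useful jitter tracking bandwidth is the one at which the phase lag of a first-order low-pass jitter transfer function at the dominant jitter frequency equals the skew-induced phase shift, and that the sign of the skew is such that the two offset each other. The function name is illustrative.

```python
import math

def phase_matched_bandwidth(jitter_hz, skew_s):
    """Bandwidth at which a first-order low-pass phase lag, atan(fj / f_bw),
    equals the skew-induced phase shift 2*pi*fj*skew (valid below 90 degrees)."""
    skew_phase = 2.0 * math.pi * jitter_hz * skew_s
    if not 0.0 < skew_phase < math.pi / 2.0:
        raise ValueError("skew phase must lie between 0 and 90 degrees")
    return jitter_hz / math.tan(skew_phase)

for skew in (500e-12, 600e-12, 1e-9):
    bw = phase_matched_bandwidth(200e6, skew)
    print(f"skew {skew * 1e12:.0f} ps: phase-matched bandwidth = {bw / 1e6:.0f} MHz")
```

For a 200 MHz jitter tone this heuristic gives roughly 275 MHz, 213 MHz, and 65 MHz for skews of 500 ps, 600 ps, and 1 ns. It is only a first-order estimate and is not guaranteed to coincide with the exact minimum of the differential jitter magnitude.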

A key receiver performance metric involves quantifying the maximum amount of sinusoidal jitter the receiver can tolerate for a given bit error rate (BER) specification, known as jitter tolerance [27]. For a sampled data receiver with an ideal 0.5 UI timing margin, the maximum tolerable phase error or differential jitter is Considering Figure 18 source synchronous model results in a maximum tolerable sinusoidal jitter amplitude of [26] which will vary based on the amount of clock-to-data skew and jitter transfer function of the receiver clocking circuitry. In addition to these effects, any VCO accumulated jitter will subtract from the system timing margin: where and is the transition density, assumed 0.5 for random data signals [28]. Thus, the jitter tolerance expression is modified for the ILO and PLL-PI de-skew architectures to include the oscillator accumulated jitter, which, as discussed in Section 3, is a function of the jitter tracking bandwidth:

In order to compare the jitter tolerance performance of the key source synchronous clock architectures at different skew conditions, a 10 Gb/s half-rate architecture with a 5 GHz forwarded clock is modeled with the following assumptions for the different receiver clock circuits. The first-order model of Figure 7 is used for the DLL-PI case, while a variable bandwidth PLL with a 5 GHz reference clock () and a maximum jitter tracking bandwidth of 150 MHz is assumed for the PLL-PI de-skew. For the ILO modeling, a four-stage ring oscillator is considered with injection strength assumed to be variable from an extremely high value of [26] to a minimum value of 0.025 to allow for a de-skew resolution near 36 phase settings within 1 UI [22]. This yields an ILO jitter tracking bandwidth which ranges from 54 MHz to 1.25 GHz for the 5 GHz forwarded clock frequency. For systems which leverage a BPF, the filter $Q$ is variable from 3 to 30, resulting in a potential jitter tracking bandwidth from 833 MHz to 83 MHz.

5.1. Zero Clock-to-Data Skew (0 UI)

While ensuring identical delays on the clock and data paths poses a major challenge, the zero clock-to-data skew is first considered in order to differentiate the receiver structures’ performance for systems which approach this ideal case.

An ideal DLL with a unity jitter transfer function over all frequencies would hypothetically provide infinite jitter tolerance over all frequencies for this zero-skew case. While peaking in real DLLs degrades this ideal performance, the system still tolerates multiple UIs of jitter at high frequency, as shown in Figure 21(a). Compensating the DLL with an additional pole can further improve the jitter tolerance at high frequencies. Leveraging band-pass filtering provides the potential for jitter filtering with DLL-PI de-skew. In the ideal zero skew case, a low-$Q$ band-pass filter provides improved DLL compensation in the 100–400 MHz range at the cost of increased amounts of in-phase correlated jitter being filtered at higher frequencies. If the BPF $Q$ is increased to a high value, the jitter tolerance degrades over all frequencies due to an increased amount of this correlated jitter filtering.

The limited PLL bandwidth causes the PLL-PI de-skew to have less jitter tolerance relative to the DLL-PI architecture, as Figure 21(b) shows the jitter tolerance falling below 1 UI near 200 MHz for the maximum 150 MHz PLL bandwidth. Performance degrades further as PLL bandwidth is reduced due to increased filtering of in-phase correlated jitter and also additional VCO jitter accumulation subtracting from the overall timing margin.

Figure 21(c) shows a similar trend for the ILO architecture, with the maximum 1.25 GHz tracking bandwidth at a 0° de-skew setting achieving more than 10 UI jitter tolerance at 200 MHz. Changing the ILO free-running frequency to obtain an output phase shift results in a lower jitter tracking bandwidth and increased oscillator jitter accumulation. Note that, for moderate output phase shifts, the overall high tracking bandwidth and lack of jitter peaking still allow the ILO to outperform the other de-skew architectures up to near 500 MHz. However, if an extreme phase shift is required, the dramatically reduced jitter tracking bandwidth and oscillator jitter accumulation cause the ILO system to have worse jitter tolerance than the DLL-PI architecture for frequencies greater than 50 MHz.

5.2. Low Clock-to-Data Skew (2 UI)

Introducing a low skew value of 200 ps degrades the jitter tolerance performance for all the presented de-skew architectures, as shown in Figure 22(a). DLL-PI without band-pass filtering and ILO de-skew display similar performance and can tolerate 2 UI of jitter near 200 MHz. Here, the ILO bandwidth is reduced to 700 MHz to compensate for the 200 ps skew. The PLL-PI de-skew, with maximum bandwidth setting of 150 MHz, displays the lowest jitter tolerance at this low skew value due to excessive filtering of in-phase correlated jitter.
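The roughly 2 UI tolerance near 200 MHz can be cross-checked with an idealized estimate. This sketch assumes an all-pass clock path (no DLL peaking and no oscillator accumulated jitter), an ideal 0.5 UI timing margin, and an unfiltered normalized differential jitter of 2|sin(π f τ)|; the function name is hypothetical.

```python
import math

def jitter_tolerance_all_pass(jitter_hz, skew_s, margin_ui=0.5):
    """Max tolerable sinusoidal jitter (UI) for an all-pass clock path:
    timing margin divided by normalized differential jitter 2*|sin(pi*fj*skew)|."""
    diff = 2.0 * abs(math.sin(math.pi * jitter_hz * skew_s))
    if diff == 0.0:
        return math.inf  # zero skew: correlated jitter is tracked perfectly
    return margin_ui / diff

for skew in (0.0, 200e-12, 500e-12, 1e-9):
    jtol = jitter_tolerance_all_pass(200e6, skew)
    print(f"skew {skew * 1e12:.0f} ps: tolerance at 200 MHz = {jtol:.2f} UI")
```

With 200 ps of skew this idealized estimate comes out very close to 2 UI at 200 MHz, and it falls below 1 UI for the larger skew values, where additional jitter filtering becomes useful.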

5.3. Mid Clock-to-Data Skew (5 UI)

An increased amount of jitter filtering benefits the jitter tolerance performance for systems with a moderate skew value of 500 ps. A DLL-PI-based de-skew system that includes a BPF with a $Q$ of 9, resulting in an overall jitter tracking bandwidth of 275 MHz, provides the best performance at 200 MHz with jitter tolerance close to 0.9 UI. The performance is similar for the DLL-PI system without band-pass filtering at low frequencies, but begins to diverge above 100 MHz due to inadequate filtering of out-of-phase correlated jitter. ILO de-skew jitter tolerance is slightly degraded relative to the DLL system due to increased oscillator accumulated jitter associated with the reduced jitter tracking bandwidth. While the 150 MHz PLL-PI system still performs the worst at low and moderate frequencies, it does offer superior jitter tolerance relative to the stand-alone DLL and ILO systems for jitter frequencies above 400 MHz due to increased filtering of this out-of-phase correlated jitter.

5.4. High Clock-to-Data Skew (10 UI)

As skew is increased to 1 ns, the PLL-PI-based de-skew with a 65 MHz bandwidth provides more comparable performance to the other de-skew architectures. While peaking in the PLL transfer function degrades the jitter tolerance at the lower frequencies, the PLL-PI system achieves 0.3 UI jitter tolerance at 200 MHz and superior performance relative to the stand-alone DLL for jitter frequencies above 300 MHz. At 200 MHz, the ILO de-skew with 65 MHz bandwidth and the BPF-DLL-PI with 83 MHz bandwidth achieve the best jitter tolerance of 0.5 UI. Here an active-inductor-based band-pass filter is assumed to achieve the $Q$ of 30 required for the 83 MHz bandwidth.

6. Conclusion

This work presented an analysis of key channel effects that impact the jitter performance of high-speed source synchronous links. Skew between the clock and data signals degrades source synchronous system timing margins, motivating the use of receiver clock circuits that provide filtering of high-frequency jitter components that would otherwise combine out of phase and increase differential jitter. High-frequency channel loss characteristics influence system forwarded clock frequency choice, as differences in the loss slope impact the amount of high-pass jitter amplification.

Also discussed were the tradeoffs in complexity and jitter tracking properties of common receiver de-skew circuits, along with how band-pass filtering can be leveraged to provide additional jitter filtering. Jitter tolerance modeling indicates that an all-pass DLL-PI or high jitter tracking bandwidth ILO structure performs best in low skew systems. For systems with large amounts of skew, PLL-PI de-skew becomes competitive with low-bandwidth ILOs and DLL-PI systems which leverage additional clock band-pass filtering. Overall, ILO-based de-skew holds the potential for high jitter tolerance over wide skew ranges at a low complexity level relative to other receiver clock topologies.

Acknowledgments

The authors would like to thank Yohan Frans, Brian Leibowitz, Jihong Ren, Sam Chang, and Masum Hossein of Rambus and Younghoon Song of Texas A&M University for advice and comments on this work. This work was supported by SRC Grant 1836.060.