Clock and Data Recovery/Introduction/Jitter is far from sinusoidal..

The study of the CDR system frequency responses (that is, the functions that have $j\omega$ as the independent variable) actually gives a representation of the CDR behaviour in the presence of sinusoidal jitter.

Sinusoidal jitter is best for modeling, for measuring and to retain some margin, but real jitter is essentially noisy

Some periodicity in the bit stream, due to framing or to periodic data patterns, may induce periodic components of jitter via intersymbol interference, but the real jitter will be essentially noisy.

- in general, system descriptions in the frequency domain (jitter, error, jitter transfer, noise transfer and jitter tolerance) are the most used and the most effective, and are familiar to all electronic engineers
- in particular, sinusoidal jitter represents the worst-case jitter with respect to jitter tolerance.
  Sinusoidal jitter can be seen as the concentration of the total jitter power at a single frequency!

Therefore sinusoidal jitter considerations are extremely useful in most cases (and often even give some margin with respect to the reality of the circuit application!).

The CDR system frequency responses, the most important tool used in this book, may therefore be seen as a good and safe way to describe the CDR behavior.

Jitter has different spectra at different points of the CDR

A clock is ideally a pure tone (a line in the spectrum), but in practice it is always a slightly jittered sine wave (the line spreads into a shape that, although still very high and thin, looks very much like the bell of a Gaussian distribution).

In serial data communication, the transmitted signal spectrum does not normally exhibit any line at the clock frequency.

The data bits are transmitted without wasting transmit power for the waveform that clocks them, in all practical types of line codes.

NRZ is the most common choice, and by far so in very high speed signal transmissions.

At the clock frequency, the received signal spectrum exhibits a minimum (a theoretical zero).

The prime task of a CDR is the recovery of the clock, so that the data recovery can follow.

As a result, the study of CDRs deals mostly with jitter processing.

The jitter, at the different nodes of a CDR, appears in very different forms.

The linear models describe the phase (=jitter) signals at the various nodes of the PLL, but they describe it ALWAYS in the BASEBAND.

At the circuit nodes where the actual signal is a modulated, high frequency signal, the model just describes the modulating signal.

Linear processing (using mathematical linear models) neither generates nor fully cancels any frequency component of the signals described, even though different frequency components of the signals may be differently amplified or attenuated. Modulation and demodulation are instead inherently non-linear processes, and they translate spectra to different frequency bands.

The spectrum of the jitter is extracted (as phase difference between the phase of the clock inherent to the received signal and the phase of the CDR local clock) by some essentially non-linear processing, and shifted to the baseband at the same time (phase comparison).

The two input signals to the Phase Comparator are to be seen as FM signals with the respective jitters as modulating signals, and the signal at the comparator output is to be seen as the baseband jitter difference signal (the phase comparator acts as a synchronous phase demodulator).

The jitter signal in the baseband is then filtered.

Finally, the filtered jitter is translated back upwards, as close as possible to the line frequency, using an FM modulator (the VCO).

The baseband low-pass filtering (from the third to the fourth frequency diagram in the figure above) can be concisely characterised by its equivalent Q factor.^[1]

A linear wander must often be included to model and to simulate correctly

A CDR always deals with two clocks: the one embedded in the received data stream, and the local clock.

Sometimes (e.g. in phase aligners) the local clock is kept synchronous with the clock embedded in the received data stream because they belong to the same clock domain.

In other cases (e.g. in regenerators and end point CDRs), the local clock would run at a frequency of its own (uncorrelated with the incoming frequency of the remote master, embedded in the received data stream), unless forced by the PLL to deviate from its free-running frequency and to lock itself to the incoming signal.

In these cases the representation of the circuit shall take this frequency difference into account, either in the input signal description or in the circuit description.

More precisely, either:

the input jitter shall include a component (= a phase ramp):

$\phi(t) = 2\pi (f_p - f_{fr})\, t$ [rad]

where f_p is the frequency of the received pulses (i.e. the frequency of the clock embedded in the received pulse stream) and f_fr is the free-running frequency of the local oscillator,

or the VCO center frequency f_fr in the model shall be correspondingly offset from f_p.
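As an illustration of the phase ramp, the following minimal Python sketch computes the phase accumulated between the received clock and the free-running local clock. The function name `phase_ramp` and the numeric values (10 GHz line rate, 100 ppm offset, 1 µs observation) are assumptions chosen for the example and do not come from the text. The result is given both in cycles (one cycle equals one unit interval, UI) and in radians.

```python
import math

def phase_ramp(f_p_hz, f_fr_hz, t_s):
    """Phase accumulated between the received clock (f_p) and the
    free-running local clock (f_fr) after t_s seconds.

    Returns (cycles, radians); one cycle corresponds to one unit interval.
    """
    cycles = (f_p_hz - f_fr_hz) * t_s
    radians = 2.0 * math.pi * cycles
    return cycles, radians

# Hypothetical example values (assumptions, not taken from the text).
f_p = 10e9                      # line pulse frequency, Hz
f_fr = f_p * (1 - 100e-6)       # local oscillator 100 ppm slow
cycles, radians = phase_ramp(f_p, f_fr, 1e-6)
print(f"{cycles:.3f} UI, {radians:.3f} rad")
```

With these assumed values the ramp reaches about one full unit interval in one microsecond, which shows why even a small frequency difference cannot be ignored in the model.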

Jitter is a discrete time variable, and some jitter samples -as often as not- may be missing!

In an NRZ signal, a transition occurs when a bit 1 is followed by a bit 0, or vice versa.

In a digital transmission, the incoming signal carries its phase information by its transitions from one level to another.

The NRZ bit stream (the most often used for data transmission) has a variable density of transitions.

(Encoding or modulation may be used, but the problem of variable density of signal transitions, although mitigated, remains).

f_p is the upper limit of linear models

A good part of all CDRs (mostly CDRs based on 2nd order architectures) react to the input phase variations only at frequencies significantly lower than the line pulse frequency f_p.

Some aspects of the CDR may be studied without exploring the range of frequencies much higher (or the range of time intervals much shorter) than the range important for the closed loop operation.

Linear models are useful when the PLL bandwidth is much narrower than f_p

The linear models of CDRs can be seen as models of discrete time systems, that are sampled at the frequency of the received pulses f_p.

The jω representation of the system behavior is meaningful up to jω_p (f_p = ω_p/2π), and repeats itself periodically at higher frequencies, without additional information to offer.
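The periodic repetition can be checked numerically. The sketch below assumes a hypothetical first-order discrete-time loop, H(z) = k / (z - (1 - k)), updated once per received pulse; the loop gain `k`, the function name `loop_response` and the value of f_p are illustrative assumptions, not a model taken from this book. It evaluates the response at a frequency f and again at f + f_p.

```python
import cmath
import math

def loop_response(f_hz, f_p_hz, k=0.05):
    """Frequency response of a hypothetical first-order discrete-time loop,
    H(z) = k / (z - (1 - k)), sampled once per received pulse (rate f_p)."""
    z = cmath.exp(2j * math.pi * f_hz / f_p_hz)
    return k / (z - (1.0 - k))

f_p = 1.0e9  # assumed line pulse frequency, Hz

for f in (1e6, 1e7, 1e8):
    h_base = loop_response(f, f_p)
    h_shifted = loop_response(f + f_p, f_p)   # same point, one period higher
    print(f"f = {f:.0e} Hz: |H(f)| = {abs(h_base):.4f}, "
          f"|H(f + f_p)| = {abs(h_shifted):.4f}")
```

Because z depends on f only through exp(j2πf/f_p), the two magnitudes printed on each line coincide: frequencies above f_p add no information to the description.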

Simulation is needed to investigate faster systems and transients

The time steps for a numerical simulation can be chosen even smaller than the line pulse period, to investigate what happens in the short intervals between signal transitions.

Sometimes the study at frequencies lower than f_p can be further simplified using normalised frequency and time scales (both in linear models and in numerical simulations).

The simplification associated with normalization of the frequency and time scales may still be adequate for the investigation.

The normalisation is made, for instance, with values of ω/ω_n, or ω/ω_n2, on the x axis.

This approach brings the focus on the fundamental concepts, which then become easier to grasp, remember and use.

De-normalization puts the actual value of the natural angular frequency ω_n in the place of the value of 1 rad/sec.

The time τ that appears in the time functions shall be readjusted in the same way, rescaling 1 sec to its actual value of 1/(ω_n) sec.
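The de-normalization step can be sketched in a few lines of Python. The function names are hypothetical, and the natural frequency of 5 kHz is only an example value (it is the one quoted in reference [1]); any other ω_n would be handled the same way.

```python
import math

def denormalize_frequency(omega_normalized, omega_n):
    """Map a normalised angular frequency (omega / omega_n) back to rad/s."""
    return omega_normalized * omega_n

def denormalize_time(tau_normalized, omega_n):
    """Map a normalised time (in units of 1/omega_n) back to seconds."""
    return tau_normalized / omega_n

# Example value only: natural frequency f_n = 5 kHz.
omega_n = 2.0 * math.pi * 5e3

# The normalised value 1 on the frequency axis becomes omega_n rad/s,
# and the normalised value 1 on the time axis becomes 1/omega_n seconds.
print(denormalize_frequency(1.0, omega_n), "rad/s")
print(denormalize_time(1.0, omega_n), "s")
```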

The angular frequency with which the signal pulses are received, ω_p, does not always appear in models/simulations.

Yet the frequency of the received pulses f_p (f_p = ω_p/(2π)) is a fundamental circuit characteristic for the phase comparator and for the VCO.

The filter is the only block that does not have to process frequencies around f_p, and very often only processes frequencies at least one or two decades lower than f_p.

With the loop closed, the study of the relation between the phase of the input signal and the phase of the output signal in the ω domain may be restricted to just the range of frequencies where the filter influences the loop!

Not infrequently, the actual value of f_p is neglected in some descriptions of the PLL operation. In many cases, it does not appear in the linear modelling equations.

All in all, the line pulse frequency f_p does not always appear in this book where one might expect it! See for instance the last figure above.

The frequency of the jitter samples f_p is jittered

The time instant of the samples is jittered to start with.

The time interval between sample instants is not exactly constant, or else there would be no jitter!

This is normally not a big concern in itself, and can often be neglected.^[2]

What really complicates things is the following point regarding missing transitions:

Jitter information only comes when there is a level transition in the received signal

This is the big problem with CDRs (a problem that does not affect frequency synthesisers).

What happens when, from a certain instant on, the received signal no longer presents any transitions?

The phase comparator, the first block that compares input and feedback signals, can no longer make meaningful comparisons, and the PLL finds itself in an “open loop” condition.

The loop phase error may drift and increase for some time, but a good lock will be found again if the phase information (signal level transitions) reappears shortly.

In the meantime the recovery (=regeneration) of the received pulses (all of the same level!) takes place correctly.

In all slave CDRs the sampling instant drifts away from the optimal position that corresponds to the locked condition of the PLL, while the feedback loop is open and can no longer keep the local oscillation phase locked to the input phase.
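The drift during an open-loop interval, and the recovery of lock when transitions reappear, can be illustrated with a very simplified simulation. The sketch below assumes a hypothetical first-order loop that corrects a fraction `k` of its phase error whenever a transition is present, and a constant drift of `offset_ui_per_bit` unit intervals per bit caused by the frequency difference. All names and numeric values are assumptions chosen for the example, and a transition in every bit interval of the "normal" stretches is an idealisation.

```python
def simulate_first_order_loop(transitions, k=0.1, offset_ui_per_bit=1e-3):
    """Track the phase error (in UI) of a hypothetical first-order loop.

    transitions: sequence of booleans, True where the received signal has a
        level transition in that bit interval.
    k: fraction of the phase error removed when a transition is present.
    offset_ui_per_bit: drift per bit due to the frequency difference between
        the received clock and the free-running oscillator.
    """
    error = 0.0
    history = []
    for has_transition in transitions:
        error += offset_ui_per_bit      # the frequency offset always pushes
        if has_transition:
            error -= k * error          # loop closed: part of the error is corrected
        history.append(error)
    return history

# 200 bits with transitions, 72 bits without any, then 200 bits with transitions.
pattern = [True] * 200 + [False] * 72 + [True] * 200
history = simulate_first_order_loop(pattern)

print("error in lock        :", history[199])
print("error after the gap  :", history[271])
print("error after recovery :", history[-1])
```

In this sketch the error grows linearly while transitions are missing, because nothing corrects the drift, and it returns to its locked value once phase information is available again.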

If the condition persists long enough, errors may appear in the data flow that comes out of the CDR, and even slips finally take place (slip = a lost or an added clock cycle with respect to the transmit clock).

Either an alarm circuit intervenes to declare a LOS^[3] (and replaces the regenerated bit flow with a telling pattern like an AIS = Alarm Indication Signal), or the CDR drifts to its free-running frequency and its output is unreliable!

(The slips that inevitably take place in that condition may be a problem in themselves. If a section of the network downstream is slaved to this CDR, errors will result wherever the original clock domain is re-entered, somewhere upstream for instance.)

As long as the lack of transitions persists, the situation worsens until the frequency of slips corresponds to the difference:

f_p (frequency of the received pulses) − f_fr (free-running frequency of the local sampling clock).
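A minimal Python sketch of this relation follows. It assumes that, with the loop open, the local oscillator runs exactly at its free-running frequency from the first missing transition onwards (a simplification), and it uses hypothetical function names and example values (10 GHz line rate, 100 ppm offset, a run of 72 identical digits).

```python
def slip_rate_hz(f_p_hz, f_fr_hz):
    """Slips per second once the local clock runs freely: |f_p - f_fr|."""
    return abs(f_p_hz - f_fr_hz)

def open_loop_drift_ui(run_length_bits, f_p_hz, f_fr_hz):
    """Drift of the sampling instant, in unit intervals, accumulated over a
    run of bit intervals without transitions (loop assumed fully open)."""
    return run_length_bits * abs(f_p_hz - f_fr_hz) / f_p_hz

# Hypothetical example values (assumptions, not taken from the text).
f_p = 10e9
f_fr = f_p * (1 - 100e-6)

print("slip rate:", slip_rate_hz(f_p, f_fr), "slips/s")
print("drift over 72 identical digits:", open_loop_drift_ui(72, f_p, f_fr), "UI")
```

One slip corresponds to a drift of one full unit interval, so the number of bit intervals needed to reach the first slip is roughly f_p divided by the slip rate.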

In some extreme cases, where the above difference is small, some transmission in the slave network section may still be possible, but with a heavy penalty in terms of errored bits and slips (= frequent retransmission of the blocks that are detected in error).

Start of reception

The CDR acquisition phase starts when a signal with sufficient power is detected by the LOS (Loss Of Signal) block, more precisely at the moment when the LOS is de-asserted. Acquisition time cannot be used for transferring useful bits of information, because it may be affected by a significant number of errored bits.

1. Burst mode. If the transmission system is meant to operate with intervals of no signal power alternating with intervals of normal transmission (Burst Mode System), acquisition takes place very often. So as not to take up a significant percentage of the total burst duration, acquisition must be fast and shall last no more than 20 to 50 line pulse periods. In burst-mode applications the CDR is based on a 1st order control loop. This type of implementation is best suited to fast acquisition, but does not perform remarkably in the presence of micro-interruptions or of long sequences of pulses without level transitions.
2. Continuous mode. If instead the start of reception is an exceptional occurrence in the system (Continuous Mode System), acquisition may last correspondingly much longer without a significant penalty for the system efficiency. The CDR is often based on a 2nd order control loop, slower in the acquisition phases, but more resilient to long sequences of pulses without level transitions and to micro-interruptions. The acquisition is less repeatable, and the circuit design and testing are more difficult.

Micro interruptions and Holdover

Holdover mode.^[4] The circuit forces the PLL open during LOS assertion, and lets the VCO continue free-running without phase discontinuity until the signal reappears. At LOS de-assertion, the VCO may have drifted very little and data regeneration can restart sampling with a still acceptable sampling time error.

Incomplete and irregular flow of incoming level transitions

A good part of all the possible transitions do not materialise because the level of the two pulses (before and after each instant of a possible transition) is the same.

If there were a transition every time one was possible, the bit pattern would be predictable, and a predictable pattern cannot carry information!

Level transitions take place in the transmitted signal less often than possible, and unpredictably.

These characteristics are a consequence of the randomness of the bits to be transmitted.

They can only be described statistically. The most used statistical quantities are:

For long term descriptions, parameters that describe average or integral properties:

- the transition density D_T (average probability of a signal level change between adjacent pulses of the signal), that corresponds to the concept of the run-length, but from the long-term average point of view.

With reference to a given length of a serial signal, and in the hypothesis that it is the average that matters:

D_T = actual number of transitions / maximum possible number of transitions [0% to 100%].

- the running disparity (the running sum of the polarities of all received bits, one level counting as +1 and the other as −1), that measures the DC content of the received signal, and is especially important in burst-mode receivers.

For short term descriptions, two parameters that describe the peaks of irregularity:

- the maximum run length (the largest number of contiguous bits during which the signal does not change level), that coincides with CID (consecutive identical digits) in NRZ coding;

- the latency of circuit blocks (its effects add to those of the run-length of the received signal to limit the response of the PLL).
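The statistical quantities listed above can be computed directly from a captured bit sequence. The following minimal Python sketch shows one way of doing it; the function names and the short example sequence are hypothetical, and the running disparity is taken here as the sum over the whole sequence, counting +1 for a bit 1 and −1 for a bit 0.

```python
from itertools import groupby

def transition_density(bits):
    """Fraction of adjacent bit pairs that differ (0.0 to 1.0)."""
    if len(bits) < 2:
        return 0.0
    transitions = sum(1 for a, b in zip(bits, bits[1:]) if a != b)
    return transitions / (len(bits) - 1)

def running_disparity(bits):
    """Sum of bit polarities: +1 for each 1, -1 for each 0."""
    return sum(1 if b else -1 for b in bits)

def max_run_length(bits):
    """Longest run of consecutive identical digits (CID)."""
    return max((len(list(group)) for _, group in groupby(bits)), default=0)

# Short hypothetical NRZ bit sequence, for illustration only.
bits = [1, 1, 0, 1, 0, 0, 0, 1, 1, 0]

print("D_T               :", transition_density(bits))
print("running disparity :", running_disparity(bits))
print("max run length    :", max_run_length(bits))
```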

In all practical cases, the original bit stream is processed before transmission, to make it easier to recover by the CDR.

Apart from modifying the NRZ coding with some sort of modulation (used to adapt NRZ spectrum to a transmission medium that is band-pass and/or frequency-dependent), the most common techniques for NRZ are:

- scrambling: a change of the bits according to a fixed rule that modifies each bit as a function of the bit itself and of a limited number of preceding bits

- an encoding that increases by some percent the bit rate of the original bit stream, either

- FEC (forward error correction), or

- line codes.
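As an illustration of scrambling, the sketch below implements a self-synchronous scrambler, in which each output bit is the XOR of the input bit with two earlier output bits, together with the matching descrambler. The function names, the all-zero initial state and the default taps are assumptions made for the example; the taps correspond to the polynomial 1 + x^39 + x^58, which is commonly associated with 64B/66B links, but any other tap set could be passed instead.

```python
def scramble(bits, taps=(39, 58)):
    """Self-synchronous scrambler: out[n] = in[n] XOR out[n-39] XOR out[n-58].

    The shift register starts at all zeros (an assumption of this sketch).
    """
    state = [0] * max(taps)          # state[i] holds out[n-1-i]
    out = []
    for bit in bits:
        scrambled = bit
        for tap in taps:
            scrambled ^= state[tap - 1]
        out.append(scrambled)
        state = [scrambled] + state[:-1]
    return out

def descramble(bits, taps=(39, 58)):
    """Matching descrambler: the register is fed with the received bits."""
    state = [0] * max(taps)          # state[i] holds received[n-1-i]
    out = []
    for bit in bits:
        recovered = bit
        for tap in taps:
            recovered ^= state[tap - 1]
        out.append(recovered)
        state = [bit] + state[:-1]
    return out

# A payload without any transition: 200 consecutive ones.
payload = [1] * 200
line_bits = scramble(payload)

print("transitions before:", sum(a != b for a, b in zip(payload, payload[1:])))
print("transitions after :", sum(a != b for a, b in zip(line_bits, line_bits[1:])))
print("round trip ok     :", descramble(line_bits) == payload)
```

The descrambler depends only on a limited number of previously received bits, which is what allows it to synchronise by itself after that many bits have been received.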

The most important line codes are presented in the following table:

Code	Max run-length (bit time intervals)	Transition density	Notes
Classic old SONET systems	80	50% average	framing bits and scrambling of payload
64B/66B	65	50% average	2 framing bits and scrambling of payload; 1900 years @ 10 Gbps
8B/10B	5	30% min, 80% max	running disparity max 2

As a rough approximation, the run-length may affect the closed loop performance of the CDR in proportion to its value multiplied by: 1 / (f_p - f_fr).

Depending on the specific CDR implementation, the performance most affected may be the jitter generation, the jitter transfer, or the jitter tolerance.

In any case, the minimum and maximum values of D_T that are specified for the application considered significantly influence both the design and the testing of a CDR.

The worst-case maximum run-length that must be tolerated in practice is 72 CID (Consecutive Identical Digits) in SDH systems,^[5] or something equivalent to not much more than 72 CID in optical links.^[6] A few repeated blocks of 72 CID, separated by just one or a few digits of opposite polarity, are unlikely but possible.

As a result, CDRs are frequently specified, designed and characterized in order to stay in phase lock even when input phase information disappears for a significant number of line pulse periods (hundreds or even a few thousands).

External References

[1] Aaron BUCHWALD and Kenneth W. MARTIN : INTEGRATED FIBER-OPTIC RECEIVERS, 1994 by Kluwer Academic Publishers http://course.ee.ust.hk/elec692e/IntegratedFiberOpticReceivers.pdf 4.4.3 Using PLLs to Synchronize a VCO to the Data Rate
.... A PLL with a lag-lead loop filter such that the closed-loop transfer function is of the second order with a damping ratio of ζ = 1/√2, and a natural frequency of fn =5 KHz, that locks into a clock tone of 10-GHz, exhibits an effective Q of approximately.

QPLL = 10 GHz / 2x5 kHz = 10⁶

This effective Q can be interpreted considering that the PLL averages the phase-error over several cycles; in this case it takes approximately one-million clock-cycles before the loop filter can accumulate a large enough signal on the VCO control line to begin tracking an input phase deviation.
The PLL can also be seen as a flywheel that is spinning at a rate close to the data rate.
The flywheel has a timing mark on it. The input data signal acts like a strobe light that flashes every time that a data transition is detected, revealing the current phase-error of the timing mark.
The loop feedback is used to align the timing mark, as revealed by the strobing flashes, to the desired position. In other words the rising edge of the local clock, that corresponds to the timing mark, should typically lock its phase half a revolution away from the transitions, i.e. 180° out of phase with them.
Increasing the time constant of the loop filter is analogous to increasing the mass of the flywheel. A narrowband loop acts like a very heavy flywheel that takes a lot of energy to alter its momentum.
The effective Q of a band-pass filter is determined by how many cycles a free running response can oscillate (point of view in the pass-band), in a PLL the effective Q is determined by how many clock cycles it takes for the VCO to respond to a phase error (point of view in the base-band). ....

[2] Richard C. Walker (2003). "Designing Bang-Bang PLLs for Clock and Data Recovery in Serial Data Transmission Systems" (PDF). pp. 34-45, a chapter appearing in "Phase-Locking in High-Performance Systems - From Devices to Architectures", edited by Behzad Razavi, IEEE Press, 2003, ISBN 0-471-44727-7, page 3: "an analysis assuming uniform time steps of t = 1 ⁄ f is sufficiently accurate for most purposes."

[3] http://www.computerhope.com/jargon/l/los.htm "Short for Loss Of Signal, LOS is an indicator on a networking device that shows a signal or connection has been dropped or terminated. LOS can occur for many reasons such as, the cable connected to the network device is bad, there is no connection on the other end, improper network configuration, or the device itself is bad."

[4] G.810, definition of holdover (the text of this reference is missing in the source).

[5] G.783-2006 03 Characteristics of synchronous digital hierarchy (SDH) equipment functional blocks; Appendix V Verification of SDH equipment CID immunity, page 275; 72 bits …

[6] G.957-200603 Optical interfaces for equipments and systems relating to the synchronous digital hierarchy; Appendix II, Implementation of the Consecutive Identical Digit (CID) immunity measurement
