ISSCC 2007 / SESSION 22 / DIGITAL CIRCUIT INNOVATIONS / 22.9

22.9 A 0.28pJ/b 2Gb/s/ch Transceiver in 90nm CMOS for 10mm On-Chip Interconnects

Eisse Mensink, Daniel Schinkel, Eric Klumperink, Ed van Tuijl, Bram Nauta

University of Twente, Enschede, The Netherlands

The bandwidth of global on-chip interconnects in modern CMOS processes is limited by their high resistance and capacitance [1]. Repeaters that are used to speed up these interconnects consume a considerable amount of power [2] and area. Recently published techniques [1-4] increase the achievable data rate at the cost of high static power consumption, leading to relatively high energy per bit for low data activity. On the other hand, low-swing schemes [5] often sacrifice bandwidth for power reduction, or make use of an extra low-voltage power supply. More ideally, a transceiver would combine low dynamic and static power with a high achievable data rate.

The bandwidth and power consumption of an RC-limited interconnect depend on its source (Zs) and load (ZL) impedances. In Fig. 22.9.1, a conventional case with an inverter used as both a transmitter (Zs = 100Ω) and a receiver (ZL = 10fF) has only 62MHz bandwidth and high power consumption. Current-sensing schemes (ZL = 190Ω in Fig. 22.9.1) increase the bandwidth up to 3x [1,4], but with increased power at low data activities. We propose using a capacitive transmitter (Zs = 255fF in Fig. 22.9.1), which has the same bandwidth improvement as current sensing, but with lower power and without static power consumption.

This paper presents a transceiver for 10mm long interconnects in a 1.2V 90nm 6M CMOS process, shown in Fig. 22.9.2. A capacitive pre-emphasis transmitter both increases the bandwidth and decreases the voltage swing, without the need for an additional power supply. As low-swing signaling is more susceptible to crosstalk, we use differential interconnects with twists [1], of which only a single-ended half is shown. In contrast to the wide interconnects used in [2,3], we use relatively small width (0.54µm) and spacing (0.32µm) [1,4] and assume high metal-density surroundings. The receiver uses decision feedback equalization (DFE) [6] to further increase the achievable data rate. The DFE, with a continuous-time feedback filter, consumes almost no extra power. The bandwidth-increasing pre-emphasis effect of the transmitter is shown at the bottom right of Fig. 22.9.2: every transition is emphasized by the transmitter by injecting a charge via capacitance Cs.

With only a series capacitor (AC-coupling), the DC voltage on the interconnect is not well defined as there is no DC path to one of the supplies. To control the DC voltage, a load resistor RL and a transconductance Gm, controlled by Vin, are added (see Fig. 22.9.2). By having the time constants Cs/Gm and RLCwire equal, the transfer function resembles the transfer function of the capacitive transmitter in Fig. 22.9.1. If a small Gm (5µS) and a large RL (16kΩ) are chosen, the static current is kept small (6µA) and also the power consumption remains similar. Gm and RL are implemented with MOS transistors as visible in the bottom part of Fig. 22.9.2. For Cs, the gate capacitance of an NMOS transistor is used. As the gate oxide is much thinner than the oxide between interconnects, the area consumed by Cs is relatively small (6x6µm²). The signals, with a voltage swing of 100mV, are chosen close to VDD of 1.2V, because the capacitance of the NMOS transistor is highest for a high gate-source voltage. The total area of the differential transmitter is 226µm².

The receiver concept is also shown in Fig. 22.9.2. A clocked comparator restores the low-swing line output to full swing. DFE further increases the achievable data rate. Instead of the often-used FIR filters [6], a continuous-time filter operates as the decision feedback filter. This filter cancels most of the ISI with a simple and power-efficient first-order implementation, whereas an FIR filter requires many taps.

The schematic of the receiver implementation is shown in Fig. 22.9.3. The left of the diagram shows a clocked comparator, a sense-amplifier-based flip-flop (SAFF), which consists of a differential input stage, cross-coupled inverters and an SR-latch. The outputs of the SR-latch drive the low-pass feedback filter, in this case an RC filter, implemented with pass-gates and anti-parallel gate capacitances. The filter output is coupled back into the SAFF via a second differential input stage, as shown on the right of Fig. 22.9.3. IEQ sets the feedback gain A (see Fig. 22.9.2). The total area of the receiver is 117µm² (32µm² for the DFE part).

The chip micrograph is shown in Fig. 22.9.7. The 10mm-long interconnects, placed in metal 4, have a total distributed resistance of 2kΩ and a capacitance of 2.8pF. The other metal layers are filled with GND- and VDD-connected metal stripes. An external pattern generator/analyzer generates data and measures BER. The receiver clock is generated externally to adapt its phase to the eye position and to be able to measure eye widths. In an application, a simple skew circuit or a source-synchronous approach could be used to generate the proper clock phase. Eye-diagrams are measured via 50Ω output buffers that are connected to the output of a differential interconnect.

Figure 22.9.4 shows a measured eye diagram at a data rate of 1Gb/s. The measured BER at the edges of the eye is also shown. The BER drops rapidly below a clock skew of -150ps and above 180ps, giving an eye-opening of 670ps. Data rates up to 1.35Gb/s are achieved without DFE (IEQ = 0). The one-σ offset of the total transceiver is 11mV, measured over 20 samples. Due to this offset, not all samples achieve 1.35Gb/s, but all samples do achieve a slightly lower data rate of 1Gb/s. Simulations over process corners also indicate that the circuit is robust to PVT variations at a rate slightly lower than the maximum achievable data rate. Data rates up to 2Gb/s are measured with DFE. Fig. 22.9.5 shows that DFE improves the eye opening for a wide range of IEQ. In an application, IEQ can therefore be fixed at design time.

In Fig. 22.9.6, the measured energy per bit is plotted as a function of transition probability at different data rates. With random data at 2Gb/s, only 0.28pJ/b is dissipated, which is 7x lower than earlier work [1,4]. The power dissipation of 0.12pJ/b at zero data activity is mainly due to the power dissipation in the SAFF, which has large transistors to get a low offset (σos = 8mV). Clock-gating can be used to eliminate power consumption during inactive periods. The DFE part of the circuit requires less than 7% of the total transceiver power, while it can increase the achievable data rate 1.5x.

With the presented transceiver, the same high data rates over small RC bandwidth limited on-chip interconnects are possible as with previous solutions, but with a 7x lower power consumption. By using both a capacitive pre-emphasis transmitter and continuous-time DFE, a data rate of 2Gb/s is achieved over a 10mm long interconnect. The transceiver consumes only 0.28pJ/b.

Acknowledgements:
We thank Philips Research for chip fabrication, the Dutch Technology Foundation (STW, project TCS.5791) for funding and Gerard Wienk for assistance.

[1] D. Schinkel, E. Mensink, E. A. M. Klumperink, et al., "A 3-Gb/s/ch Transceiver for 10-mm Uninterrupted RC-limited Global On-Chip Interconnects," IEEE J. Solid-State Circuits, vol. 41, no. 1, pp. 297-306, Jan., 2006.
[2] A. P. Jose, G. Patounakis, and K. L. Shepard, "Pulsed Current-Mode Signaling for Nearly Speed-of-Light Intrachip Communication," IEEE J. Solid-State Circuits, vol. 41, no. 4, pp. 772-780, Apr., 2006.
[3] A. P. Jose and K. L. Shepard, "Distributed Loss Compensation for Low-Latency On-Chip Interconnects," ISSCC Dig. Tech. Papers, pp. 516-517, Feb., 2006.
[4] L. Zhang, J. Wilson, R. Bashirullah, et al., "Driver Pre-Emphasis Techniques for On-Chip Global Buses," ISLPED, pp. 186-191, Aug., 2005.
[5] H. Zhang, V. George, and J. M. Rabaey, "Low-Swing On-Chip Signaling Techniques: Effectiveness and Robustness," IEEE Trans. VLSI Systems, vol. 8, pp. 264-272, Jun., 2000.
[6] V. Stojanovic, A. Ho, B. Garlepp, et al., "Adaptive Equalization and Data Recovery in a Dual-Mode (PAM2/4) Serial Link Transceiver," Symp. VLSI Circuits, pp. 348-351, Jun., 2004.
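As an illustrative aside on the receiver concept, the following Python sketch shows why a first-order decision feedback filter can remove the inter-symbol interference of a first-order channel. It is an idealized, noise-free, discrete-time model and not the circuit of this work: the real interconnect is a distributed RC line, so a first-order filter cancels most, not all, of the ISI. The pole `a`, the feedback gain `gain` (playing the role of the gain A that IEQ sets in the circuit) and all function names are assumptions introduced for this example.

```python
def channel(bits, a):
    """Hypothetical first-order low-pass channel, sampled once per bit:
    x[n] = a * x[n-1] + (1 - a) * d[n], with d[n] in {-1, +1}."""
    x, samples = 0.0, []
    for d in bits:
        x = a * x + (1.0 - a) * d
        samples.append(x)
    return samples


def slicer_only(samples):
    """Comparator without feedback: decide on the sign of each sample."""
    return [1 if x >= 0 else -1 for x in samples]


def dfe(samples, a, gain):
    """Comparator with a first-order feedback filter driven by its own
    past decisions (assumed model of decision feedback equalization)."""
    s, decisions = 0.0, []
    for x in samples:
        # Subtract the filtered past decisions before deciding.
        d_hat = 1 if x - gain * s >= 0 else -1
        decisions.append(d_hat)
        # Update the feedback filter state with the new decision.
        s = a * s + (1.0 - a) * d_hat
    return decisions


# A run of ones followed by a single -1 is the worst case for ISI.
bits = [1, 1, 1, 1, -1]
received = channel(bits, a=0.7)
print(slicer_only(received))            # [1, 1, 1, 1, 1]: last bit is wrong
print(dfe(received, a=0.7, gain=0.7))   # [1, 1, 1, 1, -1]: matches bits
```

With `gain` equal to `a`, the feedback term reproduces the channel memory exactly as long as past decisions are correct, so the comparator sees only the current bit. In this model a pole above 0.5 closes the eye for the plain slicer, which mirrors the way DFE extends the achievable data rate in the text.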
ISSCC 2007 / February 14, 2007 / 12:00 PM
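Some of the measured figures quoted in the text can be cross-checked with simple arithmetic. The Python sketch below assumes that the 670ps eye opening is the 1Gb/s bit period minus the clock-skew window between -150ps and 180ps, and that the DFE share is bounded by 7% of the power at 2Gb/s with random data. These readings and the function names are assumptions made for the example, not statements from the measurements.

```python
def eye_opening_ps(data_rate_gbps, window_start_ps, window_end_ps):
    """Bit period minus the clock-skew window excluded from the eye
    (assumed reading of the BER measurement)."""
    bit_period_ps = 1000.0 / data_rate_gbps
    return bit_period_ps - (window_end_ps - window_start_ps)


def power_mw(energy_pj_per_bit, data_rate_gbps):
    """Energy per bit times bit rate: pJ/b * Gb/s gives mW."""
    return energy_pj_per_bit * data_rate_gbps


# Eye opening at 1Gb/s with the window from -150ps to 180ps.
print(eye_opening_ps(1.0, -150.0, 180.0))   # 670.0

# Transceiver power per channel at 2Gb/s with random data.
total_mw = power_mw(0.28, 2.0)
print(total_mw)                             # 0.56

# Upper bound on the DFE share, taken as 7% of the total.
print(round(0.07 * total_mw, 4))            # 0.0392
```

Under these assumptions the transceiver draws roughly 0.56mW per channel at 2Gb/s, of which the DFE part accounts for less than about 0.04mW.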
Figure 22.9.1: Bandwidth and energy per bit versus transition probability (= data activity) for three different termination schemes. The results are for 10mm differential interconnects with a distributed resistance of 2kΩ and a distributed capacitance of 2.8pF.
[Panels: Conventional; Current-sensing; Capacitive transmitter. Axes: frequency (Hz); transition probability.]

Figure 22.9.2: Concept of transceiver and circuit implementation of the capacitive pre-emphasis transmitter.
[Blocks: capacitive pre-emphasis transmitter; interconnect and biasing; clocked comparator with continuous-time feedback filter; circuit implementation.]

Figure 22.9.3: Implementation of the clocked comparator with continuous-time feedback filter.

Figure 22.9.4: Eye-diagram at the input of the receiver at 1Gb/s and measured Bit Error Rate at the edges of the eye.
[Panel: Bit Error Rate Measurements. Axis: clock delay (ps).]

[Panels: Eye-opening Measurements; Power Consumption Measurements.]
Figure