3D DfT architecture for pre-bond and post-bond testing


Citation for published version (APA):
Marinissen, E. J., Chi, C. C., Verbree, J., & Konijnenburg, M. (2010). 3D DfT architecture for pre-bond and post-bond testing. In IEEE 3D System Integration Conference 2010, 3DIC 2010 [5751450] https://doi.org/10.1109/3DIC.2010.5751450

DOI: 10.1109/3DIC.2010.5751450

Document status and date: Published: 01/12/2010

Document Version: Accepted manuscript including changes made at the peer-review stage


3D DfT Architecture for Pre-Bond and Post-Bond Testing

Erik Jan Marinissen (1), Chun-Chuan Chi (2), Jouke Verbree (3), Mario Konijnenburg (4)

(1) IMEC vzw, Kapeldreef 75, B-3001 Leuven, Belgium, erik.jan.marinissen@imec.be
(2) National Tsing-Hua University, Dept. Electrical Engineering, Hsinchu 30013, Taiwan, ROC, ccchi@larc.ee.nthu.edu.tw
(3) Delft University of Technology, Dept. Computer Engineering, Mekelweg 4, 2628CD Delft, The Netherlands, j.verbree@student.tudelft.nl
(4) Holst Centre/IMEC, High Tech Campus 31, 5656AE Eindhoven, The Netherlands, mario.konijnenburg@imec-nl.nl

Munich, Germany – November 2010

Abstract

Process technology developments enable the creation of three-dimensional stacked ICs (3D-SICs) interconnected by means of Through-Silicon Vias (TSVs). This paper presents a 3D Design-for-Test (DfT) architecture for such 3D-SICs that allows pre-bond die testing as well as post-bond stack testing of both partial and complete stacks. The architecture enables a modular test approach, in which the various dies, their embedded IP cores, the inter-die TSV-based interconnects, and the external I/Os can be tested as separate units to allow flexible optimization of the 3D-SIC test flow. The architecture builds on and reuses existing DfT hardware at the core, die, and product level. Its main new component is a die-level wrapper, which can be based on either IEEE Std 1500 or IEEE Std 1149.1. The paper presents a conceptual overview of the architecture, as well as implementation aspects. Experimental results show that the implementation costs are negligible for medium to large dies.

1 Introduction

Three-dimensional stacking of multiple integrated circuits has benefits in terms of combining heterogeneous technologies and achieving a small footprint. The semiconductor industry is preparing itself to make a major step forward in three-dimensional stacking, now that the technology of TSVs is becoming available [1–3]. TSVs are conducting nails which extend out of the back-side of a thinned-down die, enabling the vertical interconnect to another die [4, 5]. TSVs are high-density, low-capacity interconnects compared to traditional wire-bonds, and hence allow for many more interconnections between stacked dies, while operating at higher speeds and consuming less power [6]. TSV-based 3D technologies enable the creation of a new generation of ‘super chips’ by opening up new architectural opportunities [7, 8]. 3D-SICs combine a smaller form factor and lower overall manufacturing costs [9] with many other compelling benefits, and hence their technology is quickly gaining ground.

Like all micro-electronic products, 3D-SICs need to be tested for manufacturing defects incurred during their many, high-precision, and hence defect-prone manufacturing steps. These tests should be both effective and cost-efficient. Solutions regarding test flow, test contents, and test access need to be developed before 3D-SICs can be brought to the market. Next to all basic and most advanced test technology issues, 3D-SICs have some unique new test challenges of their own [10, 11]. These challenges include (1) development of new fault models and corresponding tests for TSV-based interconnects and new 3D-induced intra-die defects, (2) wafer probing on small and numerous micro-bumps and/or TSV tips and pads under stringent damage requirements, (3) handling of and probing on wafers with thinned-die stacks, (4) the design, partitioning, and optimization of DfT architectures that span across multiple dies, and (5) optimization of the test flow for maximum effectiveness and lowest cost.

This paper describes a 3D DfT architecture that services the test needs of die maker(s), stack maker, and stack user alike. The architecture is based on a die-level test wrapper that should be included by the various die makers in the designs of the respective dies that together make up the stack. Our 3D DfT architecture supports (1) pre-bond die testing, (2) post-bond stack testing (of both partial and complete stacks), as well as (3) board-level interconnect testing. Our DfT architecture enables a modular test approach [12], in which the various dies, their embedded IP cores, the inter-die TSV-based interconnects, and the external I/Os can be tested as separate units. This modular test approach provides first-order yield monitoring, and allows for flexible inclusion (or exclusion) and scheduling of (re-)tests at the various product stages, for example depending on the maturity of the manufacturing process.

The remainder of this paper is organized as follows. Section 2 describes the assumptions and requirements that form the foundation of our 3D DfT architecture. The architecture itself is presented in Section 3; this section also describes the two alternative variants, based on IEEE Std. 1500 [13, 14] and IEEE Std. 1149.1 [15, 16]. Section 4 details various implementation aspects of the die-level wrapper, and in Section 5 we present experimental results. Section 6 concludes this paper.

2 Assumptions and Requirements

In this paper, we consider 3D-SICs for which all inter-die connections are implemented by means of TSVs and for which all external connections (‘pins’) of the stack are located on one side of one of the extreme tiers (top or bottom). Figure 1 shows three example implementations: (a) wire-bond from the top die, (b) wire-bond from the bottom die, and (c) flip-chip connections from the bottom die. To simplify our descriptions, we assume in the remainder of this paper that all pins are in the bottom die; this assumption is without loss of generality, as we can always swap the references to top and bottom die.

Figure 1: Three options for 3D-SIC external I/Os: (a) wire-bond from top die, (b) wire-bond from bottom die, and (c) flip-chip from bottom die.

A 3D DfT architecture should service the test needs of die maker(s), stack maker, and stack users alike. The die maker(s) might execute pre-bond tests, covering the intra-die circuitry and possibly also the TSVs [11]. The stack maker might execute post-bond tests on partial and/or complete, not-yet-packaged and/or packaged die stacks; these tests might cover intra-die circuitry (possibly as re-test), as well as the inter-die TSV-based connections [11]. We assume that the stack user requires the overall stack product to be IEEE 1149.1 [15, 16] compliant on its pins, in order to facilitate board-level interconnect testing.

We assume a 3D-SIC of which the constituting dies are scan testable; for example, this can include scan-tested digital logic, BIST-ed embedded memories, or even scan-enabled analog cores. To minimize silicon area, we want to re-use the existing intra-die DfT infrastructure as much as possible: internal scan chains, test control, test data compression circuitry, built-in self-test, etc. We assume that additional external test pins beyond what is required functionally and for IEEE 1149.1 are expensive and hence should be avoided. In contrast, we assume that some additional TSV-based interconnects between tiers for the purpose of test are relatively affordable; e.g., IMEC’s via-middle TSVs are made at a 10 µm minimum pitch [4, 5].

Today’s probe technology is insufficiently precise and damage-free to provide probe access on small micro-bumps, TSV tips, or TSV landing pads [11]. As long as that is the case, it is a requirement to provide dedicated probe pads for pre-bond wafer test access [11, 17, 18] for all dies in the stack, apart from the bottom die.

For the post-bond stack tests, test access is only possible via the external I/Os of the bottom die. This implies that signals for test control and test data exclusively come from and go to the bottom die, and hence have to have a ‘u-turn’ type of shape; we refer to these as TestTurns. Also, in order to reach dies higher up in the stack, the underlying dies need to cooperate in a dedicated mode which requires additional DfT and TSVs which we refer to as TestElevators.

We require the 3D DfT architecture to be scalable in multiple ways. We will equip it with both a fixed one-bit (‘serial’), as well as a scalable multi-bit (‘parallel’) test access mechanism. The focus of the serial mechanism is on debug and diagnosis; it provides a low-cost, low-bandwidth mechanism for test configuration instructions and test data, which can be used even if the stack product is soldered onto a printed circuit board. The focus of the scalable parallel mechanism is on high-volume production testing; it provides a trade-off between implementation costs and test access bandwidth. In addition, the architecture should be scalable in the sense that it works for an undetermined number of stack tiers. Also, the architecture should not predestine a die to a certain tier level, such that dies that adhere to the architecture can function at any level in the stack hierarchy.

A final requirement is that the 3D DfT architecture should support a modular test approach [12, 19], as opposed to an approach in which the entire stack is tested as one monolithic entity. A modular test considers the various dies and TSV-based interconnect layers as separate test units; for complex dies, it is very well possible that they are further sub-divided into multiple finer-grain test modules, e.g., embedded cores. A modular test approach allows tests to be optimized for circuit-specific fault models, enables flexible test flow optimization, and provides a first-order fault diagnosis.

3 3D DfT Architecture

3.1 Architecture Overview

The 3D DfT architecture consists of a set of cooperating die-level test wrappers, one for each die in the stack. A conceptual overview of the architecture is depicted in Figure 2. The figure shows an example stack consisting of three dies. The functional I/Os of the three dies are shown in yellow. At the bottom of bottom Die 1 are the external I/Os (‘pins’). The dies are interconnected by means of functional TSVs. The figure shows in light-blue the conventional, already existing DfT infrastructure. The external I/Os of the stack, all located in the bottom die, are wrapped by IEEE 1149.1 Boundary Scan; this requires a limited number of additional pins, of which two (TDI and TDO) are shown in light-blue. Furthermore, the dies have existing intra-die DfT, exemplified by internal scan chains, Test Data Compression (TDC), Built-In Self Test (BIST), IEEE 1500-compliant core wrappers, and Test Access Mechanisms (TAMs).

Figure 2: Conceptual overview of our 3D DfT architecture.

Shown in light-red is the new 3D DfT, comprised of test wrappers around each die in the stack. The main features of the die-level wrapper are the following: (1) a serial interface for wrapper instructions and low-bandwidth test data and a scalable, parallel interface for higher-bandwidth test data, (2) TestTurns in every die, that feed test data back to the pins in the bottom die, (3) a scalable number of dedicated probe pads on all non-bottom dies to enable pre-bond die testing, (4) TestElevators that propagate test signals up and down through the stack, and (5) a hierarchical test control mechanism that controls the test mode of each die and optionally opens up to control possible embedded cores within a particular die.

Our two proposed 3D die wrappers are based on either one of the existing DfT standards IEEE 1500 and IEEE 1149.1. In the subsequent sub-sections, we describe both alternative architectures in more detail.

3.2 Die-Level Wrapper Based on IEEE 1500

IEEE Std 1500 standardizes a test wrapper for embedded cores in an SOC [13, 14]. Figure 3(a) shows a conceptual view of an IEEE 1500-compliant wrapper. It has two test access ports. A single-bit (‘serial’) port WSI-WSO is mandatory and used for both loading wrapper instructions as well as for low-bandwidth test data. An optional, scalable (‘parallel’) port WPI-WPO can carry higher-bandwidth test data. The combination of a pseudo-static wrapper instruction, shifted into the Wrapper Instruction Register (WIR), and the values on the Wrapper Serial Control (WSC) signals determines the operation of the wrapper. The wrapper has an inward-facing test mode for testing the embedded core itself (‘Intest’), as well as an outward-facing test mode for testing the circuitry external to the embedded core (‘Extest’). In both modes, the Wrapper Boundary Register (WBR) is activated to apply stimuli and capture responses. The wrapper can also activate its ‘Bypass’ mode, for example to test another core in the SOC.

Figure 3: IEEE 1500 wrapper: (a) conventional and (b) 3D-enhanced.

Stacked dies in a 3D-SIC can be considered similar to embedded cores in a System-on-Chip (SOC). Consequently, the IEEE 1500 core wrapper can be used and enhanced to form a die-level wrapper for 3D-SICs [20]. Figure 3(b) shows such a 3D-enhanced die wrapper, based on IEEE 1500. The 3D enhancements are highlighted in orange and comprise the following four items (numbered consistently with Section 3.1).

2. TestTurns: The standard IEEE 1500 interface, consisting of WSC, WSI-WSO, and WPI-WPO, is located at the bottom side of the die. In the output paths toward WSO and WPO, we insert pipeline registers for a clean timing interface (especially important if many dies are stacked).

3. Probe Pads: All non-bottom dies are equipped with additional probe pads, as long as probe technology does not provide us with solutions to safely probe micro-bumps and/or TSV tips and landing pads. These probe pads are mandatory on the serial interface (WSC, WSI-WSO), and optional and scalable on the parallel interface (WPI-WPO). If the parallel WPI-WPO interface coming from the bottom is n bits wide (with n ≥ 0), the corresponding probe pad interface can be m bits wide, with 0 ≤ m ≤ n.

4. TestElevators: The standard IEEE 1500 interface is copied at the top side of the die, toward higher-up dies. We give these I/Os the same names, post-fixed with the letter “s” (for “stack”).

5. Hierarchical WIR: IEEE 1500 mandates concatenation of core-level WIRs. Following this for die-level wrappers, the total WIR chain length would depend on the number of dies in the stack, the number of embedded cores with WIRs per die, and the summed length of the various WIR instructions. To prevent an unbridled growth of the overall WIR chain length for 3D-SICs, we provision the 3D die-level WIRs with a control bit that allows the core-level WIRs within that die to be bypassed. This hierarchical WIR mechanism, which opens up as needed, similar to a harmonica, is shown in Figure 4. Initially, the WIR chain only consists of the die-level WIRs. Once loaded with die-level instructions, the core-level WIR chain segments are included in the overall WIR chain for only those dies for which the corresponding control bit was set; subsequently, further core-level WIR instructions can be loaded.

Figure 4: Hierarchical WIR chain, which has opened up to include the core-level WIRs of Dies 2 and 3, which are in one of their Intest modes.
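The harmonica-like behaviour of the hierarchical WIR chain can be illustrated with a few lines of Python. This is only a sketch: the WIR lengths and core counts below are hypothetical values chosen for illustration, not figures from this paper, and the names `Die` and `wir_chain_length` are assumptions introduced here.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Die:
    """Hypothetical description of the WIRs in one die (lengths in bits)."""
    die_wir_bits: int                                        # die-level WIR length
    core_wir_bits: List[int] = field(default_factory=list)   # one entry per wrapped core
    core_wirs_enabled: bool = False                          # die-level control bit


def wir_chain_length(stack: List[Die]) -> int:
    """Die-level WIRs are always in the chain; core-level WIRs only if enabled."""
    total = 0
    for die in stack:
        total += die.die_wir_bits
        if die.core_wirs_enabled:
            total += sum(die.core_wir_bits)
    return total


# Three dies, mirroring the situation of Figure 4: Dies 2 and 3 have opened up.
stack = [
    Die(die_wir_bits=4, core_wir_bits=[3, 3]),
    Die(die_wir_bits=4, core_wir_bits=[3], core_wirs_enabled=True),
    Die(die_wir_bits=4, core_wir_bits=[3, 3, 3], core_wirs_enabled=True),
]
print(wir_chain_length(stack))  # 4 + (4 + 3) + (4 + 9) = 24
```

With all control bits cleared, the same stack would have a chain of only 12 bits, which is the shortest length the chain can take.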

Figure 5 shows the 3D DfT architecture with IEEE 1500-based die wrappers for a stack of three dies. The WSC control signals are broadcast to all dies. The serial and parallel mechanisms are daisy-chained throughout the stack. The middle die has a wrapper as described above. The die wrappers for the top and bottom dies are slightly different. The top die has no die above it, and hence does not implement TestElevators. The bottom die contains all external I/Os. Hence, it implements IEEE 1149.1 for board-level interconnect testing. Its serial interface, consisting of WSC and WSI-WSO, is connected to its IEEE 1149.1 TAP controller, as is common in conventional SOCs [14], in order to save dedicated pins. The parallel interface WPI-WPO is multiplexed onto existing functional pins. Consequently, the overall 3D DfT architecture does not incur additional stack pins beyond the standard four/five-pin interface of IEEE 1149.1.

Figure 5: 3D-SIC DfT architecture for dies based on IEEE 1500.

3.3 Die-Level Wrapper Based on IEEE 1149.1

IEEE Std 1149.1 standardizes a test wrapper for chips on a Printed Circuit Board (PCB) [15, 16]. Figure 6(a) shows a conceptual view of an IEEE 1149.1 compliant wrapper. On purpose, we drew it in the same style as Figure 3. It shows that the IEEE 1500 wrapper and IEEE 1149.1 wrapper have large commonalities, but there are also a number of significant differences.

• IEEE 1149.1 only has a serial mechanism, and lacks a parallel port.

• Instead of the six-bit (or optional seven-bit) WSC control port of IEEE 1500, IEEE 1149.1 has a two-bit (or optional three-bit) control port, consisting of the signals TCK, TMS, and optionally TRSTN∗. Internally, the additional control signals are generated by stepping through a 16-state finite state machine named TAP Controller.

Figure 6: IEEE 1149.1 wrapper: (a) conventional and (b) 3D-enhanced.

Stacked dies in a 3D-SIC can be considered similar to chips on a PCB. Consequently, the IEEE 1149.1 chip wrapper can be used and enhanced to form a die-level wrapper for 3D-SICs. Figure 6(b) shows such a 3D-enhanced die wrapper, based on IEEE 1149.1. The 3D enhancements are highlighted in orange and comprise the following four items.

1. Parallel Test Port: In order to support efficient high-volume testing of the die’s circuitry, a parallel, scalable test port of user-defined width n is provisioned. We refer to the inputs and outputs of this port as TPI and TPO, respectively.

2. TestTurns: Similar to the IEEE 1500 die-level wrapper (Section 3.2).

3. Probe Pads: Similar to the IEEE 1500 die-level wrapper (Section 3.2).

4. TestElevators: Similar to the IEEE 1500 die-level wrapper (Section 3.2).

The hierarchical WIR is achieved without additional implementation effort. In common SOC implementations, there already exists a hierarchical relationship between a chip-level IEEE 1149.1 Instruction Register (IR) and the core-level IEEE 1500 WIRs.

Figure 7 shows the 3D DfT architecture with IEEE 1149.1-based die wrappers for a stack of three dies. The architecture has large similarities to the one based on IEEE 1500 in Figure 5. In fact, the only major difference is in the number and function of the broadcast control signals (six/seven-bit WSC vs. two/three-bit TCK/TMS/TRSTN∗) and the presence in IEEE 1149.1 of the TAP Controller.

There exist many alternative uses of IEEE 1149.1 beyond board-level interconnect testing for purposes like silicon and software debug, emulation, in-circuit programming, etc. [21–25]. These applications have a large hardware and software infrastructure, which relies on the presence of the IEEE 1149.1 features. A potential benefit of basing 3D die-level wrappers on IEEE 1149.1, as proposed in this section, is that this infrastructure remains operational, also for 3D-SICs.

Figure 7: 3D-SIC DfT architecture for dies based on IEEE 1149.1.

3.4 Operating Modes

3D DfT architectures as described in Sections 3.2 and 3.3 support a number of test modes. Figure 8 shows which combinations of wrapper settings can be made by traversing this so-called ‘railroad diagram’ from left to right. In total 16 test modes are possible: four in the pre-bond case, and twelve in the post-bond case. Some examples of operating modes are SerialPrebondIntestTurn, ParallelPrebondIntestTurn, SerialPostbondBypassElevator, and ParallelPostbondExtestTurn; an exhaustive list is provided in Table 1.

Figure 8: ‘Railroad diagram’ for operating mode set-up.
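The count of 16 modes can be reproduced by walking the choices of the railroad diagram in code. The sketch below is an illustration only; the restriction that pre-bond modes offer neither Extest nor Elevator is inferred here from the mode counts in the text and from the mode names listed in Table 1, and the function name `operating_modes` is an assumption.

```python
from itertools import product


def operating_modes():
    """Enumerate mode names of the form Access + Bond + Test + Direction."""
    modes = []
    for access, bond in product(("Serial", "Parallel"), ("Prebond", "Postbond")):
        if bond == "Prebond":
            # Inferred rule: before bonding only Bypass/Intest with a Turn are listed.
            tests, directions = ("Bypass", "Intest"), ("Turn",)
        else:
            tests, directions = ("Bypass", "Intest", "Extest"), ("Turn", "Elevator")
        for direction, test in product(directions, tests):
            modes.append(f"{access}{bond}{test}{direction}")
    return modes


modes = operating_modes()
prebond = [name for name in modes if "Prebond" in name]
postbond = [name for name in modes if "Postbond" in name]
print(len(modes), len(prebond), len(postbond))  # 16 4 12
```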

Combining instructions for the various dies in a stack allows us to test one, multiple, or all dies simultaneously, as well as test one, multiple, or all layers of TSV-based interconnects simultaneously. For example, in a four-die stack, it would be possible to simultaneously test the TSV-based interconnects between Dies 2 and 3 and the internal circuitry of Die 4, all through the high-bandwidth parallel port, by assigning the various dies in the stack the following instructions.

• Die 1: ParallelPostbondBypassElevator
• Die 2: ParallelPostbondExtestElevator
• Die 3: ParallelPostbondExtestElevator
• Die 4: ParallelPostbondIntestTurn
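To make the role of the Turn and Elevator settings in such an assignment concrete, the following sketch traces which dies a test path passes through for a given list of per-die instructions. It is a simplified model under stated assumptions: the helper name `dies_on_test_path` is hypothetical, the regular expression only checks the syntax of a mode name, and the model merely encodes the idea from Section 2 that test data travels upward through dies in an Elevator mode and returns at the first die in a Turn mode.

```python
import re
from typing import List

# Syntax of a mode name: access, bonding stage, test type, direction.
MODE_RE = re.compile(
    r"^(Serial|Parallel)(Prebond|Postbond)(Bypass|Intest|Extest)(Turn|Elevator)$"
)


def dies_on_test_path(instructions: List[str]) -> List[int]:
    """Return the 1-based die numbers traversed; instructions[0] is the bottom die."""
    path = []
    for number, name in enumerate(instructions, start=1):
        match = MODE_RE.match(name)
        if match is None:
            raise ValueError(f"not a well-formed mode name: {name}")
        path.append(number)
        if match.group(4) == "Turn":
            return path  # test data turns back toward the bottom die here
    raise ValueError("no die is in a Turn mode, so the test path never returns")


example = [
    "ParallelPostbondBypassElevator",   # Die 1
    "ParallelPostbondExtestElevator",   # Die 2
    "ParallelPostbondExtestElevator",   # Die 3
    "ParallelPostbondIntestTurn",       # Die 4
]
print(dies_on_test_path(example))  # [1, 2, 3, 4]
```

If Die 2 were given a Turn instruction instead, the path would end there and Dies 3 and 4 would not be reached.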

4 Implementation Aspects

This section details several implementation aspects of our proposed 3D-enhanced die-level wrappers. Due to lack of space, we describe the 1500-based wrapper only, but the implementation aspects discussed are quite similar for the 1149.1-based wrapper. This section first considers a relatively simple case of a die which consists of one (‘flat’) monolithic scan-testable logic design only and a wrapper for which the number of probe pads equals the number of TestElevator TSVs (n = m). Subsequently we address a more complex case, in which the die is an SOC with top-level logic and embedded cores, and a wrapper for which n ≠ m.

Figure 9: Implementation of a 3D-enhanced IEEE 1500 wrapper for a flat die: (a) wrapper implementation, (b) ParallelPrebondIntestTurn, and (c) SerialPostbondExtestElevator.

4.1 3D Wrapper for a Flat Die

Figure 9(a) shows the implementation of a 3D-enhanced wrapper for a flat die. This (simplified) example die only contains flat top-level logic. It has three functional primary inputs (PI[0..2]) and three functional primary outputs (PO[0..2]); some of these functional signals are (to be) connected to the die below this one (at the left-hand side of the figure), others are (to be) connected to the die above this one (at the right-hand side of the figure). In Figure 9(a), these functional I/Os are highlighted by bold orange arrows. The DfT implementation in the die consists of three internal scan chains.

The 3D-enhanced die wrapper is drawn in light-blue, encapsulating the die. The wrapper contains all elements introduced in Section 3: WBR cells (shown in Figure 9(a) as little white rectangles), WSC, WIR, serial port WSI-WSO, serial bypass WBY, parallel port WPI-WPO, parallel bypass (‘Bypass’), extra probe pads, TestElevators, and pipeline registers (‘Reg’). In our example, we have chosen the parallel TestElevator and the parallel probe pad port to be of equal width, viz. n = m = 3.

The wrapper can be reconfigured in various operating modes, as described in Section 3.4. Each operating mode enables a different test access path through the wrapper. Two examples of such operating modes and their corresponding test access paths are shown in Figures 9(b) and 9(c). Figure 9(b) shows the ParallelPrebondIntestTurn mode. This mode is intended for a time-efficient high-volume production test of the intra-die circuitry before stacking. The three-bit wide test access path is highlighted in the figure by means of bold red, green, and blue lines. Figure 9(c) shows the SerialPostbondExtestElevator mode. This mode is intended for a low-bandwidth test of the inter-die TSV-based connections after bonding. The single-bit test access path is highlighted in the figure by means of a bold violet line.

Reconfiguration of the wrapper into its various operating modes is done through multiplexers, which are controlled by the WSC control signals and the currently active WIR instruction. In this paper, we assign numbered names to the wrapper multiplexers: m1, m2, m3, . . .. Multiplexers with the same name are controlled by the same control signal.

Figure 10 shows commonly used IEEE 1500 WBR cells for respectively a (core or die) input and output [12, 14, 26]. The two wrapper cells are essentially equal, apart from their multiplexer control signals: for Intest and Extest modes, the m2 and m3 multiplexers need to be in opposite states.

Figure 10: A typical IEEE 1500 WBR cell: (a) for inputs and (b) for outputs.

The other multiplexer names are shown in Figure 9. Multiplexers m4 . . . m7 select among the conventional IEEE 1500 modes, including Serial/Parallel and Intest/Extest/Bypass. Multiplexer m8 is controlled by the SelectWIR signal from WSC and determines whether the serial port WSI-WSO is used for loading a new instruction into the WIR or for loading test data into WBR or WBY.

New for the 3D-enhanced IEEE 1500 wrapper are multiplexers m9 and m10. Multiplexers m9 select as I/Os between the extra probe pads on the die (pre-bond testing) and the TestElevator TSVs from the die below (post-bond testing). Multiplexers m10 select between the Turn and Elevator operating modes.

Table 1 shows the assignment of all multiplexer control signals for the various operating modes of the wrapper. This table is essentially the output specification of the WIR. The input specification of the WIR is given by the user-defined instruction codes for each of the operating modes.

Mode                             m1   m2  m3  m4  m5  m6  m7  m8  m9  m10
SerialPrebondBypassTurn          x    0   0   x   x   x   1   0   0   1
SerialPrebondIntestTurn          1/0  1   0   1   1   01  0   0   0   1
ParallelPrebondBypassTurn        x    0   0   x   x   x   0   0   0   1
ParallelPrebondIntestTurn        1/0  1   0   0   1   10  1   0   0   1
SerialPostbondBypassTurn         x    0   0   x   x   x   1   0   1   1
SerialPostbondIntestTurn         1/0  1   0   1   1   01  0   0   1   1
SerialPostbondExtestTurn         1/0  0   1   1   0   01  0   0   1   1
SerialPostbondBypassElevator     x    0   0   x   x   x   1   0   1   0
SerialPostbondIntestElevator     1/0  1   0   1   1   01  0   0   1   0
SerialPostbondExtestElevator     1/0  0   1   1   0   01  0   0   1   0
ParallelPostbondBypassTurn       x    0   0   x   x   x   0   0   1   1
ParallelPostbondIntestTurn       1/0  1   0   0   1   10  1   0   1   1
ParallelPostbondExtestTurn       1/0  0   1   0   0   00  1   0   1   1
ParallelPostbondBypassElevator   x    0   0   x   x   x   0   0   1   0
ParallelPostbondIntestElevator   1/0  1   0   0   1   10  1   0   1   0
ParallelPostbondExtestElevator   1/0  0   1   0   0   00  1   0   1   0

Table 1: Multiplexer control signals for all operating modes.
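Because Table 1 acts as the output specification of the WIR, it lends itself to a simple lookup structure. The sketch below transcribes the table into Python and checks three regularities that can be read off its rows: m9 is 1 exactly in the post-bond modes, m10 is 1 exactly in the Turn modes, and m2 and m3 differ in every Intest and Extest row. The helper names `parse_table` and `check_patterns` are assumptions made for this illustration.

```python
COLUMNS = ["m1", "m2", "m3", "m4", "m5", "m6", "m7", "m8", "m9", "m10"]

TABLE_1 = """
SerialPrebondBypassTurn x 0 0 x x x 1 0 0 1
SerialPrebondIntestTurn 1/0 1 0 1 1 01 0 0 0 1
ParallelPrebondBypassTurn x 0 0 x x x 0 0 0 1
ParallelPrebondIntestTurn 1/0 1 0 0 1 10 1 0 0 1
SerialPostbondBypassTurn x 0 0 x x x 1 0 1 1
SerialPostbondIntestTurn 1/0 1 0 1 1 01 0 0 1 1
SerialPostbondExtestTurn 1/0 0 1 1 0 01 0 0 1 1
SerialPostbondBypassElevator x 0 0 x x x 1 0 1 0
SerialPostbondIntestElevator 1/0 1 0 1 1 01 0 0 1 0
SerialPostbondExtestElevator 1/0 0 1 1 0 01 0 0 1 0
ParallelPostbondBypassTurn x 0 0 x x x 0 0 1 1
ParallelPostbondIntestTurn 1/0 1 0 0 1 10 1 0 1 1
ParallelPostbondExtestTurn 1/0 0 1 0 0 00 1 0 1 1
ParallelPostbondBypassElevator x 0 0 x x x 0 0 1 0
ParallelPostbondIntestElevator 1/0 1 0 0 1 10 1 0 1 0
ParallelPostbondExtestElevator 1/0 0 1 0 0 00 1 0 1 0
"""


def parse_table(text):
    """Map each mode name to a dict of multiplexer name -> control value string."""
    table = {}
    for line in text.strip().splitlines():
        mode, *values = line.split()
        if len(values) != len(COLUMNS):
            raise ValueError(f"unexpected number of columns for {mode}")
        table[mode] = dict(zip(COLUMNS, values))
    return table


def check_patterns(table):
    """Return a list of rows that violate the regularities described above."""
    problems = []
    for mode, ctrl in table.items():
        if ctrl["m9"] != ("1" if "Postbond" in mode else "0"):
            problems.append((mode, "m9"))
        if ctrl["m10"] != ("1" if mode.endswith("Turn") else "0"):
            problems.append((mode, "m10"))
        if ("Intest" in mode or "Extest" in mode) and ctrl["m2"] == ctrl["m3"]:
            problems.append((mode, "m2/m3"))
    return problems


table = parse_table(TABLE_1)
print(len(table), check_patterns(table))  # 16 []
print(table["ParallelPrebondIntestTurn"]["m9"], table["ParallelPrebondIntestTurn"]["m10"])  # 0 1
```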

4.2 3D Wrapper for a Hierarchical Die

In this section, we consider the implementation details for a slightly more complex case, in which (1) the wrapper has different widths for parallel probe pad ports and parallel TestElevator ports (i.e., n ≠ m), and (2) the die is a core-based SOC with top-level logic and embedded cores. Figure 12(a) shows the implementation of a 3D-enhanced wrapper for this case. The figure is in the same style as Figure 9; the differences required to support (1) and (2) are highlighted by means of purple and dark-green outlines, respectively.

In this example, our 3D-enhanced wrapper has different pre-bond and post-bond parallel port widths, viz. n = 3 and m = 2. As shown in Figure 12(a) by means of purple outlines, this requires two extra m9 multiplexers as well as two new multiplexers m13 and m14 to switch between pre-bond parallel test modes (with m = 2) and post-bond parallel test modes (with n = 3).

The example die has one embedded core, named Core 1; in our simplified example, the single Core 1 actually represents a possibly larger number of embedded cores. Core 1 is wrapped with a conventional IEEE 1500 wrapper (not shown) with a parallel port WPI-WPO of three bits wide. The example TAM architecture in our example SOC is a Daisychain Architecture [19, 26].

The internal scan chains in the die’s top-level logic, which embeds Core 1, now need to be equipped with local serial and parallel bypasses; these bypasses would become active in case we want to test Core 1 on its own, i.e., without testing the die’s top-level logic. In Figure 12 these bypasses are shown with multiplexer m11. They need to be controlled from a WIR bit. Instead of adding a single-bit WIR in the die’s top-level logic, we have opted for extending the die-level WIR with this one extra bit.

As this example die contains an embedded core, it implements the hierarchical WIR feature described in Section 3.2. Figure 11 details the implementation of this feature. All die-level WSC signals are passed on to WIR1 of Core 1, apart from the signal WRSTN, which is AND-gated with C_WIR_EN. This ensures that WIR1 of Core 1 is kept in its (functional) reset state, until it is enabled. As a response to appropriate instructions, the die-level WIR asserts a pseudo-static test control signal C_WIR_EN that indicates when the core-level WIRs should be enabled. When WIR1 is enabled, multiplexer m12 extends the WIR chain to include WIR1 in it.

Figure 11: Implementation of the hierarchical WIR control mechanism.
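The control mechanism of Figure 11 boils down to two small pieces of combinational logic, which the following sketch models at the level of single bits. It rests on assumptions that go beyond the text: WRSTN is treated as an active-low reset (as in IEEE 1500), the enable signal is written C_WIR_EN, multiplexer m12 is assumed to be steered directly by C_WIR_EN, and the function names are hypothetical.

```python
def core_wrstn(die_wrstn: int, c_wir_en: int) -> int:
    """WRSTN as seen by WIR1: the die-level WRSTN AND-gated with C_WIR_EN.

    WRSTN is active-low in this model, so an output of 0 keeps WIR1 in reset.
    """
    return die_wrstn & c_wir_en


def wir_chain_output(die_wir_out: int, core_wir_out: int, c_wir_en: int) -> int:
    """Multiplexer m12 (assumed select = C_WIR_EN): choose the end of the WIR chain."""
    return core_wir_out if c_wir_en else die_wir_out


# Truth table of the reset gating.
for die_wrstn in (0, 1):
    for c_wir_en in (0, 1):
        print(die_wrstn, c_wir_en, core_wrstn(die_wrstn, c_wir_en))
```

As long as C_WIR_EN is 0, the gated reset stays at 0 regardless of the die-level WRSTN, which corresponds to WIR1 being held in its reset state until it is enabled.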

With the hierarchical WIR set-up, we distinguish three types of operations: a WRSTN reset, followed by respectively zero, one, or two instruction loads, as shown in Figure 13. A single WRSTN reset is sufficient to jump-start all WIRs into their functional (non-test) mode [14]. The WIR chain is reset to its shortest length, through the die-level WIRs only. To enter a test mode, it is sufficient to subsequently load the appropriate instructions in all die-level WIRs. If we want to enable one or more core-level WIRs, the corresponding die-level WIR instructions need to assert their C_WIR_EN signals. The hierarchical WIR chain will then be reconfigured to include the core-level WIRs of the corresponding dies. Subsequently, the longer WIR chain will need to be reprogrammed with instructions for all die-level WIRs and the selected core-level WIRs. Note that one can flexibly re-order tests without explicitly keeping track of the WIR chain length in the previous test, provided all tests start with a WIR chain reset pre-amble.

Figure 12: Panels (a), (b) ParallelPrebondIntestTurn – test Top-Level Logic, and (c) ParallelPostbondIntestElevator – test Core 1.

Figure 13: WIR instruction sequence: (1) reset, (2) die-level WIR configuration, (3) die- and core-level WIR configuration.

For this example die, Figures 12(b) and 12(c) show two operating mode examples and their corresponding test access paths. Figure 12(b) shows a mode in which the die’s top-level logic is tested. The die wrapper is in its ParallelPrebondIntestTurn mode. Note that this test requires the IEEE 1500 wrapper of embedded Core 1 to participate in its ParallelExtest (WP_EXTEST) mode, as the inputs and outputs of Core 1 actually are outputs and inputs, respectively, of the die’s top-level logic. Also note that, although the die and its embedded core support a test path width of three bits, in this pre-bond test mode only two input and output pads are provided (m = 2). Consequently, we are forced to assign the three internal test paths to two external pads, as highlighted in the figure by means of bold red and blue lines.

Figure 12(c) shows a mode in which Core 1 is tested. The die wrapper is in its ParallelPostbondIntestElevator mode. The die’s top-level logic is bypassed, and Core 1 is in its ParallelIntest (WP_INTEST) mode. This example is a post-bond test mode (n = 3), and the test data paths are highlighted by means of bold red, green, and blue lines.
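The hierarchical WIR reconfiguration described in this section (reset to the shortest chain, a die-level load that may assert C_WIR_EN, then a longer chain that includes the enabled core-level WIRs) can be illustrated with a small behavioral sketch. This is only a toy model under stated assumptions, not the circuit of Figure 11: the `Die` class, the three-die stack, and the die names are hypothetical, and chain length is counted in registers rather than in instruction bits.

```python
from dataclasses import dataclass


@dataclass
class Die:
    """Toy model of one die: one die-level WIR plus optional core-level WIRs."""
    name: str
    num_core_wirs: int = 0
    c_wir_en: bool = False  # pseudo-static enable asserted by the die-level WIR

    def wirs_in_chain(self) -> int:
        # Core-level WIRs join the chain only while C_WIR_EN is asserted.
        return 1 + (self.num_core_wirs if self.c_wir_en else 0)


def wrstn_reset(stack):
    """WRSTN reset: all WIRs return to functional mode; chain is shortest."""
    for die in stack:
        die.c_wir_en = False


def chain_length(stack) -> int:
    """Number of WIRs currently in the chain (registers, not bits)."""
    return sum(die.wirs_in_chain() for die in stack)


def load_die_level(stack, enable_cores):
    """First load after reset: one instruction per die-level WIR.

    `enable_cores` is the set of die names whose instruction asserts C_WIR_EN.
    """
    if chain_length(stack) != len(stack):
        raise RuntimeError("die-level load assumes a freshly reset chain")
    for die in stack:
        die.c_wir_en = die.name in enable_cores


# Hypothetical three-die stack; only the middle die has an embedded core.
stack = [Die("bottom"), Die("middle", num_core_wirs=1), Die("top")]

wrstn_reset(stack)
after_reset = chain_length(stack)          # die-level WIRs only

load_die_level(stack, enable_cores={"middle"})
after_first_load = chain_length(stack)     # core WIR of the middle die joins

wrstn_reset(stack)
after_second_reset = chain_length(stack)   # back to the shortest chain

print(after_reset, after_first_load, after_second_reset)  # prints: 3 4 3
```

Because every test starts from the reset preamble, the sketch never needs to remember the chain length left behind by the previous test, which mirrors the re-ordering argument made in the text.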

5 Experimental Results

The implementation costs for the 3D die wrapper are threefold: (1) additional probe pads, (2) additional TSVs, and (3) additional logic gates. For the IEEE 1500-based wrapper, the additional probe pad count is 6 + 2 + 2m (with m ≥ 0) for respectively the WSC, WSI-WSO serial port, and WPI-WPO parallel port; for the IEEE 1149.1-based wrapper, this number changes to 2 + 2 + 2m. These numbers exclude pads and TSVs for infrastructure like power, ground, and clocks. For the IEEE 1500-based wrapper, the additional TSV count is 6 + 2 + 2n (with n ≥ 0); for the IEEE 1149.1-based wrapper, this number changes to 2 + 2 + 2n.
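The pad and TSV counts above are simple linear expressions, which the following sketch evaluates for both wrapper variants. The function names and the dictionary keys are hypothetical; the constants are the ones quoted in the text, and the split of the IEEE 1149.1 count into a control part and a serial part is assumed to follow the same order as for IEEE 1500.

```python
# (control wires, serial port wires) per wrapper variant, as quoted in the text.
FIXED_WIRES = {
    "ieee1500": (6, 2),    # WSC control, WSI-WSO serial port
    "ieee1149.1": (2, 2),  # assumed same ordering: control, serial port
}


def extra_probe_pads(standard: str, m: int) -> int:
    """Additional pre-bond probe pads for a parallel port of width m."""
    if m < 0:
        raise ValueError("m must be >= 0")
    control, serial = FIXED_WIRES[standard]
    return control + serial + 2 * m


def extra_tsvs(standard: str, n: int) -> int:
    """Additional TSVs for a post-bond parallel port of width n."""
    if n < 0:
        raise ValueError("n must be >= 0")
    control, serial = FIXED_WIRES[standard]
    return control + serial + 2 * n


# Widths of the example die discussed earlier: m = 2 pre-bond, n = 3 post-bond.
for std in FIXED_WIRES:
    print(std, extra_probe_pads(std, 2), extra_tsvs(std, 3))
# prints:
# ieee1500 12 14
# ieee1149.1 8 10
```

These counts still exclude infrastructure such as power, ground, and clocks, as noted in the text.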

The area costs of the additional logic gates consist of three components.

• A fixed cost, including WIR, WBY, and some of the configuration multiplexers.

• A variable cost that scales linearly with the number of functional I/Os of the die, including the WBR cells.

• A variable cost that scales linearly with the number of die-internal scan chains, including the multiplexers for scan chain concatenation.

The area cost can be estimated with the following equation:

$$A_w = F_{cost} + (\#IO \times IO_{cost}) + (\#SC \times SC_{cost}) \tag{5.1}$$

where $F_{cost}$, $IO_{cost}$, and $SC_{cost}$ are fixed areas, and $\#IO$ and $\#SC$ are circuit-dependent parameters, representing the number of I/Os and internal scan chains respectively.

In order to verify our proposed 3D-enhanced wrapper design and assess its implementation costs, we have set up a prototype tool flow that adds a 3D wrapper to a die design. The tool flow starts with the gate-level netlist of a die design, including its conventional internal DfT features. Subsequently, we use a commercial EDA tool to add a conventional test wrapper to the die. We manually modify the 2D wrapper into a 3D-enhanced wrapper, as there is no commercial tool support for that available yet. Next, we are able to assess the impact on the design size by reporting the gate area costs. Finally, we verify our design by generating test patterns with a commercial ATPG tool and simulating the resulting test sets.

We have applied the tool flow described above to three ISCAS’89 benchmark circuits: s400, s1423, and s5378 [27], posing as to-be-wrapped dies. Our experimental results for these three circuits are presented in the top three rows of Table 2. Column 1 lists the circuit name. Columns 2 to 5 give several circuit characteristics: number of standard cells, number of flip-flops, number of inputs, and number of outputs. We mapped these three circuits to the Faraday/UMC 90nm CMOS standard cell library, and the resulting area is presented in Column 6. Note that these three circuits are very small and not representative of dies to be stacked in commercial 3D-SICs.

Figure 14 shows the schematic view of the 3D-wrapped circuit s1423. In the Faraday/UMC 90nm CMOS library, $F_{cost} = 432\,\mu\mathrm{m}^2$, $IO_{cost} = 36\,\mu\mathrm{m}^2$, and $SC_{cost} = 63\,\mu\mathrm{m}^2$. The variable wrapper area cost parameters for the three circuits are given in Columns 7 and 8 of Table 2. Column 9 presents the estimated wrapper area costs according to Equation (5.1), while Column 10 gives the actual gate area costs as reported by our EDA tool; both are almost equal, indicating that Equation (5.1) is a good estimator.
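To make the comparison between estimated and actual wrapper area concrete, the sketch below evaluates Equation (5.1) with the library constants quoted above and the #IO, #SC, and actual-area values listed for the three ISCAS’89 circuits in Table 2. The function and variable names are hypothetical; only the numbers come from the paper.

```python
# Fixed areas in µm² for the Faraday/UMC 90nm library, as quoted in the text.
F_COST = 432
IO_COST = 36
SC_COST = 63


def wrapper_area(num_io: int, num_sc: int) -> int:
    """Estimated wrapper area in µm² according to Equation (5.1)."""
    return F_COST + num_io * IO_COST + num_sc * SC_COST


# circuit -> (#IO, #SC, actual wrapper area in µm²), taken from Table 2
circuits = {
    "s400": (9, 3, 942),
    "s1423": (22, 3, 1411),
    "s5378": (84, 3, 3645),
}

for name, (num_io, num_sc, actual) in circuits.items():
    estimated = wrapper_area(num_io, num_sc)
    print(name, estimated, actual, estimated - actual)
# prints:
# s400 945 942 3
# s1423 1413 1411 2
# s5378 3645 3645 0
```

The estimate is off by at most 3 µm² for these three circuits, which is the sense in which the text calls Equation (5.1) a good estimator.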


| Circuit Name | #Cells | #FFs | #Inputs | #Outputs | Area A (µm²) | #IO | #SC | Estimated Aw (µm²) | Actual Aw (µm²) | Overhead (Aw/A) |
|---|---|---|---|---|---|---|---|---|---|---|
| s400 [27] | 186 | 21 | 3 | 6 | 1,044 | 9 | 3 | 945 | 942 | +90.517% |
| s1423 [27] | 734 | 74 | 17 | 5 | 3,748 | 22 | 3 | 1,413 | 1,411 | +37.700% |
| s5378 [27] | 2,961 | 179 | 35 | 49 | 11,751 | 84 | 3 | 3,645 | 3,645 | +31.019% |
| PNX8550 [28] | 10M | 338,859 | ? | ? | ∼50M | ∼280 | 140 | 19,332 | n.a. | +0.039% |

Table 2: Experimental results regarding area costs of 3D-enhanced wrappers.

Finally, Column 11 of Table 2 presents the relative overhead of the die wrapper compared to the original die area. These percentages are large, but one has to take into account that they are distorted by the fact that the ISCAS’89 benchmarks under consideration are very small compared to industrial 3D-SIC dies. The overhead percentages already show a rapid decline with the increasing die size of the three ISCAS’89 circuits. To demonstrate that a 3D-enhanced wrapper brings negligible additional area costs to an industrial SOC, we applied our estimator equation to published data for a commercial SOC, PNX8550 [28]. The last row in Table 2 shows that with 0.039%, the 3D wrapper costs are truly negligible.
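As a worked check of the PNX8550 row, the estimate can be written out term by term. This assumes the approximate table entries are taken at face value, i.e. #IO = 280 and a die area of A = 50 × 10⁶ µm²; the paper lists both only as approximate values.

```latex
\begin{align*}
A_w &= F_{cost} + (\#IO \times IO_{cost}) + (\#SC \times SC_{cost}) \\
    &= 432 + (280 \times 36) + (140 \times 63) \\
    &= 432 + 10\,080 + 8\,820 = 19\,332\ \mu\mathrm{m}^2, \\
\frac{A_w}{A} &\approx \frac{19\,332}{50 \times 10^{6}} \approx 3.9 \times 10^{-4} \approx 0.039\%.
\end{align*}
```

The same ratio taken with the estimated wrapper area reproduces the benchmark entries as well, for example 945 / 1,044 ≈ 90.517% for s400.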

6 Conclusion

In this paper, we presented a generic Design-for-Test architecture for TSV-based 3D-SICs. The main component of our 3D DfT architecture is a die-level wrapper. The paper describes two alternative wrappers, one based on an extended version of IEEE 1500, the other based on an extended version of IEEE 1149.1. Both wrappers have the following key features: (1) a serial (one-bit) and scalable parallel (n-bit) test access mechanism, (2) TestTurns from and to the stack’s external I/Os (typically located in the bottom die), (3) additional probe pads for all non-bottom dies allowing for pre-bond testing, (4) TestElevators that carry test data up and down through the stack in post-bond testing, and (5) a hierarchical (W)IR chain that prevents unbridled growth of the test instruction sequences. The main difference between the IEEE 1500- and IEEE 1149.1-based die wrappers is in the width of the broadcast control busses (six or seven vs. two or three wires), the on-die TAP Controller (absent vs. present), and the support for existing debug and emulation set-ups (absent vs. present).

The architecture leverages existing intra-die DfT features such as internal scan, test data compression, built-in self-test, and core-based wrappers and TAMs, as well as boundary scan at the 3D-SIC’s PCB interface, and requires no additional product-level pins. The architecture services the test needs for die maker(s), stack maker, and stack user alike, by providing support for (1) pre-bond die testing, (2) post-bond stack testing (of both partial and complete stacks), and (3) board-level interconnect testing. The architecture supports a modular test approach, in which dies and their embedded cores, as well as inter-die interconnects, can be tested separately. The architecture provides maximum freedom w.r.t. inclusion or exclusion of certain tests at a particular stage of the test flow and allows for flexible (re-)scheduling of those tests, in order to optimize the test flow and minimize the associated test costs. We have shown that the implementation costs for medium and large industrial SOCs are negligible.

The proposed architecture is structured, as it provides a common DfT template that meets all 3D-SIC test access requirements. The proposed architecture is also scalable, in the sense that it works for all stack heights and provides user-defined test access bandwidth; the latter provides a trade-off opportunity between silicon area and test length. Consequently, the architecture is a great starting point for future standardization and automation in EDA tool flows for DfT insertion and test expansion.

References

[1] Robert S. Patti. Three-Dimensional Integrated Circuits and the Future of System-on-Chip Designs. Proceedings of the IEEE, 94(6):1214–1224, June 2006.

[2] Eric Beyne and Bart Swinnen. 3D System Integration Technologies. In Proceedings of IEEE International Conference on Integrated Circuit Design and Technology (ICICDT), pages 1–3, June 2007.

[3] Philip Garrou, Christopher Bower, and Peter Ramm, editors. Handbook of 3D Integration – Technology and Applications of 3D Integrated Circuits. Wiley-VCH, Weinheim, Germany, August 2008.

[4] Bart Swinnen et al. 3D Integration by Cu-Cu Thermo-Compression Bonding of Extremely Thinned Bulk-Si Die Containing 10µm Pitch Through-Si Vias. In Proceedings IEEE International Electron Devices Meeting (IEDM), pages 1–4, May 2006.

[5] Jan Van Olmen et al. 3D Stacked IC Demonstration using a Through Silicon Via First Approach. In Proceedings IEEE International Electron Devices Meeting (IEDM), pages 1–4, December 2008.

[6] Kaustav Banerjee et al. 3-D ICs: A Novel Chip Design for Improving Deep-Submicrometer Interconnect Performance and Systems-on-Chip Integration. Proceedings of the IEEE, 89(5):602–633, May 2001.

[7] Gabriel H. Loh, Yuan Xie, and Bryan Black. Processor Design in 3D Die-Stacking Technologies. IEEE Micro, 27(3):31–48, May/June 2007.

[8] Roshan Weerasekera et al. Extending Systems-on-Chip to the Third Dimension: Performance, Cost and Technological Tradeoffs. In Proceedings International Conference on Computer-Aided Design (ICCAD), pages 212–219, November 2007.

[9] Dimitrios Velenis et al. Impact of 3D Design Choices on Manufacturing Cost. In Proceedings IEEE International Conference on 3D System Integration (3DIC), September 2009.

[10] Hsien-Hsin S. Lee and Krishnendu Chakrabarty. Test Challenges for 3D Integrated Circuits. IEEE Design & Test of Computers, 26(5):26–35, September/October 2009.

[11] Erik Jan Marinissen and Yervant Zorian. Testing 3D Chips Containing Through-Silicon Vias. In Proceedings IEEE International Test Conference (ITC), November 2009. Paper ET1.1.

[12] Erik Jan Marinissen and Yervant Zorian. IEEE 1500 Enables Modular SOC Testing. IEEE Design & Test of Computers, 26(1):8–16, January/February 2009.

[13] IEEE Computer Society. IEEE Std 1500™-2005, IEEE Standard Testability Method for Embedded Core-based Integrated Circuits. IEEE, New York, NY, USA, August 2005.

[14] Francisco da Silva, Teresa McLaurin, and Tom Waayers. The Core Test Wrapper Handbook – Rationale and Application of IEEE Std. 1500™, volume 35 of Frontiers in Electronics Testing. Springer-Verlag, Boston, MA, USA, 2006.

[15] IEEE Computer Society. IEEE Std 1149.1™-2001, IEEE Standard Test Access Port and Boundary-Scan Architecture. IEEE, New York, NY, USA, June 2001.

[16] Kenneth P. Parker. The Boundary-Scan Handbook. Springer-Verlag, third edition, June 2003.

[17] Dean L. Lewis and Hsien-Hsin S. Lee. A Scan-Island Based Design Enabling Prebond Testability in Die-Stacked Microprocessors. In Proceedings IEEE International Test Conference (ITC), October 2007. Paper 21.2.

[18] Li Jiang et al. Layout-Driven Architecture Design and Optimization for 3D SoCs under Pre-Bond Test-Pin-Count Constraint. In Proceedings International Conference on Computer-Aided Design (ICCAD), pages 191–196, November 2009.

[19] Sandeep Kumar Goel and Erik Jan Marinissen. SOC Test Architecture Design for Efficient Utilization of Test Bandwidth. ACM Transactions on Design Automation of Electronic Systems, 8(4):399–429, October 2003.

[20] Erik Jan Marinissen, Jouke Verbree, and Mario Konijnenburg. A Structured and Scalable Test Access Architecture for TSV-Based 3D Stacked ICs. In Proceedings IEEE VLSI Test Symposium (VTS), pages 269–274, April 2010.

[21] Mike Winters. Using IEEE-1149.1 for In-Circuit Emulation. In WESCON ’Idea/Microelectronics’ Conference, pages 525–528, September 1994.

[22] Dae-Young Jung, Sung-Ho Kwak, and Moon-Key Lee. Reusable Embedded Debugger for 32-bit RISC Processor Using the JTAG Boundary Scan Architecture. In IEEE Asia-Pacific Conference on ASIC, pages 209–212, August 2002.

[23] IEEE Computer Society. IEEE Std 1532™-2002, IEEE Standard for In-System Configuration of Programmable Devices. IEEE, New York, NY, USA, January 2003.

[24] Ken Posse et al. IEEE P1687: Toward Standardized Access of Embedded Instrumentation. In Proceedings IEEE International Test Conference (ITC), pages 1–8, October 2006.

[25] Bart Vermeulen et al. Communication-Centric SoC Debug Using Transactions. pages 69–76, May 2007.

[26] Tom Waayers, Richard Morren, and Roberto Grandi. Definition of a Robust Modular SOC Test Architecture; Resurrection of the Single TAM Daisy-Chain. In Proceedings IEEE International Test Conference (ITC), Austin, TX, USA, November 2005.

[27] Franc Brglez, David Bryan, and Krzysztof Koźmiński. Combinational Profiles of Sequential Benchmark Circuits. In Proceedings International Symposium on Circuits and Systems (ISCAS), pages 1924–1934, May 1989.

[28] Sandeep Kumar Goel et al. Test Infrastructure Design for the Nexperia™ Home Platform PNX8550 System Chip. In Proceedings Design, Automation, and Test in Europe (DATE) Designers Forum, pages 108–113, Paris, France, February 2004.
