Transformations for polyhedral process networks Meijer, S.


Academic year: 2021


Transformations for polyhedral process networks

Meijer, S.

Citation

Meijer, S. (2010, December 8). Transformations for polyhedral process networks. Retrieved from https://hdl.handle.net/1887/16221

Version: Corrected Publisher’s Version

License: Licence agreement concerning inclusion of doctoral thesis in the Institutional Repository of the University of Leiden

Downloaded from: https://hdl.handle.net/1887/16221

Note: To cite this publication please use the final published version (if applicable).


Transformations for Polyhedral Process Networks

Sjoerd Meijer


Transformations for Polyhedral Process Networks

Doctoral thesis

submitted to obtain the degree of Doctor at the Universiteit Leiden, on the authority of the Rector Magnificus, prof.mr. P.F. van der Heijden,

by decision of the College voor Promoties (Doctorate Board), to be defended on Wednesday, 8 December 2010, at 16:15

by Sjoerd Meijer, born in Leiderdorp in 1979.


Composition of the doctoral committee:

Supervisor (promotor): Prof.dr. Ed F. Deprettere, Universiteit Leiden

Co-supervisor (co-promotor): Dr. Todor Stefanov, Universiteit Leiden

Other members:

Prof.dr. Harry Wijshoff, Universiteit Leiden

Prof.dr. Joost Kok, Universiteit Leiden

Prof. Dr.-Ing. Jürgen Teich, Universität Erlangen-Nürnberg

Prof.dr. Gerard Smit, Universiteit Twente

Prof.dr. Henk Corporaal, Technische Universiteit Eindhoven

Transformations for Polyhedral Process Networks Sjoerd Meijer. -

Thesis Universiteit Leiden. - With index, ref. - With summary in Dutch ISBN 978-90-9025792-1

Copyright © 2010 by Sjoerd Meijer, Leiden, The Netherlands.

Cover design by Senny Yu.

All rights reserved. No part of the material protected by this copyright notice may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying, recording or by any information storage and retrieval system, without permission from the author.

Printed in the Netherlands


Contents

1 Introduction
1.1 Problem Statement
1.2 Contributions
1.3 Related Work
1.4 Outline

2 Background
2.1 Polyhedra
2.2 Lexicographic Order
2.3 Static Affine Nested-Loop Programs
2.4 Extracting the Polyhedral Model from SANLPs
2.5 Polyhedral Process Networks
2.6 Validity of Transformations

3 Process Splitting Transformations
3.1 Process Splitting: Definitions, Notations, and Examples
3.2 Challenges of Applying the Process Splitting Transformation
3.3 Partitioning Metrics
3.3.1 Computation and Communication Costs
3.3.2 Initial Delay
3.3.3 Production Period
3.3.4 Data Transfers
3.3.5 Additional Control Overhead
3.4 Compile-time Selection of Splitting Transformation
3.5 Case-Studies
3.5.1 Single Diagonal Dependence
3.5.2 Matrix Multiplication with Multiple Dependencies
3.5.3 Four Producers with Delays
3.6 Discussion and Summary

4 Process Merging Transformations
4.1 Process Merging: Definitions
4.2 Challenges of Applying the Process Merging Transformation
4.3 Restrictions on the Throughput Modeling
4.4 Throughput Modeling
4.4.1 Process Throughput and Throughput Propagation
4.4.2 Isolated Throughput of a (Compound) Process
4.4.3 FIFO Channel Throughput
4.4.4 Aggregated FIFO Throughput
4.4.5 System Throughput Calculation Algorithm
4.5 Case-Studies
4.5.1 Merging Light-Weight Producers
4.5.2 Merging Processes in Networks with Different Data Paths
4.6 Discussion and Summary

5 Applying Transformations in Combination
5.1 Impact of the Transformation on Performance Results
5.1.1 Transforming a PPN to Create More Processes
5.1.2 Transforming a PPN to Reduce the Number of Processes
5.1.3 The Optimization Pitfall: Performance Degradation
5.2 Compile-Time Solution for Transformation Ordering
5.2.1 Creating Load-Balanced Tasks
5.2.2 Selecting Processes for Transformations
5.3 Exploiting Data-Level Parallelism
5.3.1 Stateful Processes
5.3.2 Cycles
5.4 Case-Studies
5.4.1 QR Decomposition: a PPN with Stateful Processes and Cycles
5.4.2 Transforming Perfectly Balanced PPNs
5.5 Discussion and Summary

6 Executing PPNs on Fixed Programmable MPSoC Platforms
6.1 The Programmable Platforms
6.2 Realizing FIFO Communication
6.3 Performance Results
6.4 Discussion and Summary

7 Conclusions

Bibliography

Index

Acknowledgments

Samenvatting (summary in Dutch)

Curriculum Vitae
