Koranne | Practical Computing on the Cell Broadband Engine | E-Book | www.sack.de
E-Book

E-Book, Englisch, 485 Seiten

Koranne Practical Computing on the Cell Broadband Engine


1. Auflage 2009
ISBN: 978-1-4419-0308-2
Verlag: Springer
Format: PDF
Kopierschutz: 1 - PDF Watermark

E-Book, Englisch, 485 Seiten

ISBN: 978-1-4419-0308-2
Verlag: Springer
Format: PDF
Kopierschutz: 1 - PDF Watermark



Practical Programming in the Cell Broadband Engine offers a unique programming guide for the Cell Broadband Engine, demonstrating a large number of real-life programs to identify and solve problems in engineering, logic design, VLSI CAD, number-theory, graph-theory, computational geometry, image processing, and other subjects. Key features include: Numerous diagrams, mnemonics, tables, charts, code samples for making program development on the CBE as accessible as possible Comprehensive reading list for introductory material to the subject matter A website providing all source codes and sample-data for examples presented in this text.

Koranne Practical Computing on the Cell Broadband Engine jetzt bestellen!

Autoren/Hrsg.


Weitere Infos & Material


1;Preface;6
2;Accompanying Website for this Book;12
3;Contents;13
4;List of Tables;21
5;List of Figures;23
6;Listings;27
7;Acronyms;31
8;Part I Introducing the Cell Broadband Engine;33
8.1;Chapter 1 Introduction;35
8.1.1;1.1 About this book;35
8.1.2;1.2 Background of the CBE Architecture and Processor;37
8.1.3;1.3 Design of the Cell Architecture;41
8.1.4;1.4 The Cell as a supercomputer;43
8.1.5;1.5 Major Components of the Cell Broadband Engine;45
8.1.6;1.6 Flex I/O, DDR/XDR, Interrupt Controller;46
8.1.7;1.7 Conclusions;46
8.2;Chapter 2 The Power Processing Element (PPE);48
8.2.1;2.1 PPE: the Control Plane of the CBEA;48
8.2.2;2.2 PPE Problem State Registers;50
8.2.3;2.3 Memory arrays in the CBEA;52
8.2.4;2.4 Viewing the PPE as a dual-core processor: EMT specifics;54
8.2.5;2.5 PowerPC Instruction Set;57
8.2.6;2.6 SPE Memory Management and EA Aliasing;57
8.2.7;2.7 Power Processor Element (PPE) Cache Utilization;58
8.2.8;2.8 Understanding the PPE Pipeline;60
8.2.9;2.9 Measuring and Profiling Application Performance;64
8.2.10;2.10 Conclusion;65
8.3;Chapter 3 The Synergistic Processing Element;66
8.3.1;3.1 Introduction to Cell SPE;66
8.3.2;3.2 SPE Local Store;69
8.3.3;3.3 SPU LS Priority;70
8.3.4;3.4 GPRs in the SPE;71
8.3.5;3.5 SPE Intrinsics and Instruction;71
8.3.6;3.6 SPU Instruction Set: Pipelines, Execution Units and Latency;72
8.3.7;3.7 Load/Store Optimization on SPU;75
8.3.8;3.8 Branch Optimization on SPU;78
8.3.9;3.9 Instruction Scheduling;80
8.3.10;3.10 Memory flow;80
8.3.11;3.11 Mailbox Facility;81
8.3.12;3.12 Use SPU Intrinsics with Mailboxes;85
8.3.13;3.13 Synergistic Processor Unit Channels and Memory Flow Controller;87
8.3.14;3.14 SPU Timer and Events;89
8.3.15;3.15 SPE Context on the PPE;89
8.3.16;3.16 Conclusion;89
8.3.17;3.17 What we have not discussed;90
8.4;Chapter 4 Element Interconnect Bus;91
8.4.1;4.1 Introduction to the Element Interconnect Bus;91
8.4.2;4.2 Important things to remember about the EIB;95
8.4.3;4.3 Conclusions;96
8.5;Chapter 5 Direct Memory Access (DMA);97
8.5.1;5.1 Introduction to DMA in the Cell Broadband Engine;97
8.5.2;5.2 Conclusions;101
9;Part II Programming the Cell Broadband Engine;102
9.1;Chapter 6 Foundations for Program Development on CBE;104
9.1.1;6.1 Theory of Parallel Programming;104
9.1.2;6.2 The Class NC of complexity analysis;106
9.1.3;6.3 Parallel programming concepts;107
9.1.4;6.4 Introduction to pthreads;108
9.1.5;6.5 Multi-faceted parallelism in the CBE Architecture;113
9.1.6;6.6 Alignment using compiler attributes;113
9.1.7;6.7 OpenMP;115
9.1.8;6.8 Double buffering example;117
9.1.9;6.9 Software Managed Cache;117
9.1.10;6.10 Physical Organization;118
9.1.11;6.11 Things to watch out for;118
9.1.12;6.12 Strategies and Paradigms for Identifying Parallelism;120
9.1.13;6.13 Conclusion;121
9.2;Chapter 7 The development environment;122
9.2.1;7.1 How to setup an PS3 for Cell development;122
9.2.2;7.2 Alternatives to installing GNU/Linux on the PS3;123
9.2.3;7.3 The CESOF Format;125
9.2.4;7.4 Basic development environment;127
9.2.5;7.5 Debugging code on the SPU;128
9.2.6;7.6 Conclusions;130
9.3;Chapter 8 Hello World;131
9.3.1;8.1 Writing simple programs on the Cell Broadband Engine;131
9.3.2;8.2 SPE Program Development;134
9.3.3;8.3 Using pthread with SPU tasks;136
9.3.4;8.4 Developing applications using libspe2;137
9.3.5;8.5 Measuring Performance using High Resolution Timers;149
9.3.6;8.6 Synchronization;151
9.3.7;8.7 Conclusions;159
9.4;Chapter 9 An Overview of the SDK;160
9.4.1;9.1 Major components of the Cell Broadband Engine SDK 3.0;160
9.4.2;9.2 Application programming libraries in the SDK;161
9.4.3;9.3 SPE Runtime Library;161
9.4.4;9.4 SPU Timer Library;162
9.4.5;9.5 Monte-Carlo Library;162
9.4.6;9.6 BLAS Library;163
9.4.7;9.7 LAPACK Library;164
9.4.8;9.8 SIMD Math API and Library;165
9.4.9;9.9 SPU Software Managed Cache;165
9.4.10;9.10 FFT Library;165
9.4.11;9.11 Game Math Library;165
9.4.12;9.12 Image Library;166
9.4.13;9.13 Large Matrix Library;166
9.4.14;9.14 Matrix Library;166
9.4.15;9.15 Synchronization Library;167
9.4.16;9.16 Vector Library;168
9.4.17;9.17 Multiprecision Math Library;168
9.4.18;9.18 ALF Library and Framework;169
9.4.19;9.19 DACS Library and Framework;170
9.4.20;9.20 Conclusions;170
9.5;Chapter 10 Basic Algorithms;171
9.5.1;10.1 Some number theoretic problems;171
9.5.2;10.2 Merit Factor of Sequences and Auto-correlation;174
9.5.3;10.3 Computing Greatest Common Divisor (GCD);181
9.5.4;10.4 Sorting Algorithms;183
9.5.5;10.5 Table lookup;189
9.5.6;10.6 (Perfect) Hash Function;190
9.5.7;10.7 Minimal Perfect Hashing;192
9.5.8;10.8 Implementing minimal perfect hash in the SPU;194
9.5.9;10.9 Using perfect hash functions with large graph clusters;195
9.5.10;10.10 Heap Data Structure;195
9.5.11;10.11 Disjoint Set Union-Find Algorithms;196
9.5.12;10.12 Computing Zech logarithm using SPE in batch mode;198
9.5.13;10.13 Conclusions;201
9.6;Chapter 11 Graph Theory on the CBEA;202
9.6.1;11.1 Introduction;202
9.6.2;11.2 Definitions;202
9.6.3;11.3 Graph representation;203
9.6.4;11.4 Minimum Spanning Tree using Kruskal’s Method;206
9.6.5;11.5 Revisiting min-cut using MST and Karger’s Method;213
9.6.6;11.6 Dijkstra’s Algorithms for Shortest Paths;214
9.6.7;11.7 Floyd-Warshall All-pairs shortest-path;217
9.6.8;11.8 Formal definition of APSP (all-pairs shortest-path);217
9.6.9;11.9 Maximal Independent Set;219
9.6.10;11.10 Approximation algorithm for maximal matching;219
9.6.11;11.11 Partitioning using Simulated Annealing;222
9.6.12;11.12 Simulated Annealing;225
9.6.13;11.13 Partitioning using Spectral Methods;232
9.6.14;11.14 On-line streaming algorithm for graph properties;237
9.6.15;11.15 Conclusion;246
9.7;Chapter 12 Alternative methods for parallel programming on SPE;247
9.7.1;12.1 Introduction;247
9.7.2;12.2 NESL: Nested Data Parallel Language;247
9.7.3;12.3 CVL and VCODE: Vector Libraries;250
9.7.4;12.4 A Simplified SPU ISA Simulator;252
9.7.5;12.5 VCODE Compiler;259
9.7.6;12.6 PowerList: Data structure for parallel programming;261
9.7.7;12.8 Tale of Dwarfs;265
9.7.8;12.9 Introduction to Cilk++;266
9.7.9;12.10 Introduction to Intel Ct;267
9.7.10;12.11 Introduction to RapidMind;267
9.8;Chapter 13 Computational Mathematics on the CBEA;269
9.8.1;13.1 Introduction;269
9.8.2;13.2 Calculating PI;269
9.8.3;13.3 Line fitting;271
9.8.4;13.4 Representing Monomials and Polynomials;275
9.8.5;13.5 Evaluating polynomials;279
9.8.6;13.6 SIMD Polynomial evaluation with fixed step;280
9.8.7;13.7 Parallel Polynomial Evaluation;280
9.8.8;13.8 Evaluating Boolean Functions;283
9.8.9;13.9 Parallel Matrix Functions;283
9.8.10;13.10 Solving Equations in a Single Complex Variable;283
9.8.11;13.11 Conclusions;309
9.9;Chapter 14 Vector Graphics on SPU;310
9.9.1;14.1 Introduction;310
9.9.2;14.2 Basic Computational Geometry on SPU;310
9.9.3;14.3 Conclusion;322
9.10;Chapter 15 Optimizing SPU Programs;323
9.10.1;15.1 Introduction to Program Optimization for Cell;323
9.10.2;15.2 GNU gprof;323
9.10.3;15.3 Branch probability extraction;324
9.10.4;15.4 SPU Assembly Analysis;325
9.10.5;15.5 FDPRPRO: Optimization Tool;328
9.10.6;15.6 Conclusion;330
10;Part III Case Studies;331
10.1;Chapter 16 Line-of-sight Computation;333
10.1.1;16.1 Introduction to the line-of-sight computation;333
10.1.2;16.2 Basic equation for LOS calculation;336
10.1.3;16.3 Watershed analysis using LOS;338
10.1.4;16.4 LOS formulation using Bresenham’s line-drawing for interpolation;338
10.1.5;16.5 SIMD Implementation of line-of-sight algorithm;341
10.1.6;16.6 Watershed Analysis;342
10.1.7;16.7 SPU Implementation;343
10.1.8;16.8 Conclusion;349
10.2;Chapter 17 Structure Determination using PDF;350
10.2.1;17.1 Problem of nano-structure determination;350
10.2.2;17.2 Implementation of the ab-initio PDF method;351
10.2.3;17.3 Algorithms and data structures required for the problem;353
10.2.4;17.5 Conclusions;368
10.3;Chapter 18 Polytope Enumeration;369
10.3.1;18.1 What is a polytope ?;369
10.3.2;18.2 Examples of polytopes;372
10.3.3;18.4 Experimental Results;383
10.3.4;18.5 Future Work;385
10.3.5;18.6 Current Results;385
10.3.6;18.7 Conclusion;386
10.4;Chapter 19 Synthetic Aperture Radar;387
10.4.1;19.1 Introduction;387
10.4.2;19.2 Fundamental principal behind Synthetic Aperture Radar;388
10.4.3;19.3 Problem Statement for SAR Image Processing;389
10.4.4;19.4 Block Adaptive Quantization;390
10.4.5;19.5 Signal processing functions for SAR;393
10.4.6;19.6 Implementation Details;394
10.4.7;19.7 Experimental Results;397
10.4.8;19.8 Conclusions;399
10.5;Chapter 20 VLSI Design Automation;400
10.5.1;20.1 CAD of Very Large Scale Integrated Circuits;400
10.5.2;20.2 Algorithmic Design and HDL Capture;401
10.5.3;20.3 Equation Processing;408
10.5.4;20.4 Microword Optimization using Local Search on SPU;414
10.5.5;20.5 Conclusion;418
10.6;Chapter 21 Floorplanning: VLSI and other Applications;419
10.6.1;21.1 What is a VLSI floorplan;419
10.6.2;21.2 The sequence-pair data-structure;419
10.6.3;21.3 Use of a floorplan in non-VLSI applications;421
10.6.4;21.4 Implementation of a Sequence-pair based Floorplanner;422
10.6.5;21.5 Conclusions;425
10.7;Chapter 22 VLSI Placement;426
10.7.1;22.1 Introduction to VLSI Standard Cell Placement;426
10.7.2;22.2 Reading gate-level BLIF files and making a graph;429
10.7.3;22.3 SPU basic cell placer, using Simulated Annealing;429
10.7.4;22.4 What is the VLSI global routing problem;451
10.7.5;22.5 Conclusion;453
10.8;Chapter 23 Power Estimation for VLSI;454
10.8.1;23.1 Introduction;454
10.8.2;23.2 Static Probabilities;455
10.8.3;23.3 Spatial Correlations;457
10.8.4;23.4 A distributed algorithm for signal probability estimation for combinational circuits;458
10.8.5;23.5 Bit-vector implementation of probability polynomials;461
10.8.6;23.6 Applications of PRBC;464
10.8.7;23.7 A Nested Data-Parallel Implementation of the Parker-McCluskey Method;465
10.8.8;23.8 Analysis of our method;467
10.8.9;23.9 PRBC Implementation;468
10.8.10;23.10 Probability Polynomial Evaluation on SPU;475
10.8.11;23.11 Experiments;476
10.8.12;23.12 Conclusion;477
10.9;Chapter 24 Concluding Remarks;478
10.9.1;24.1 Progress in small steps;478
10.9.2;24.2 Use of Common Lisp;478
10.9.3;24.3 Future Work;479
10.9.4;24.4 Conclusions;481
10.9.5;24.5 Website Information;481
11;Appendix A Introduction to Common Lisp;482
11.1;A.1 Introduction to Common Lisp;482
11.2;A.2 Functions in Common Lisp;483
11.3;A.3 File I/O;484
11.4;A.4 Conclusion;484
12;Index;485
13;References;496



Ihre Fragen, Wünsche oder Anmerkungen
Vorname*
Nachname*
Ihre E-Mail-Adresse*
Kundennr.
Ihre Nachricht*
Lediglich mit * gekennzeichnete Felder sind Pflichtfelder.
Wenn Sie die im Kontaktformular eingegebenen Daten durch Klick auf den nachfolgenden Button übersenden, erklären Sie sich damit einverstanden, dass wir Ihr Angaben für die Beantwortung Ihrer Anfrage verwenden. Selbstverständlich werden Ihre Daten vertraulich behandelt und nicht an Dritte weitergegeben. Sie können der Verwendung Ihrer Daten jederzeit widersprechen. Das Datenhandling bei Sack Fachmedien erklären wir Ihnen in unserer Datenschutzerklärung.