E-book, English, 296 pages
Gaster / Howes / Kaeli: Heterogeneous Computing with OpenCL
1st edition, 2011
ISBN: 978-0-12-387767-3
Publisher: Elsevier Science & Techn.
Format: EPUB
Copy protection: 6 - ePub Watermark
Benedict R. Gaster is a software architect working on programming models for next-generation heterogeneous processors, with a particular focus on high-level abstractions for parallel programming on the emerging class of processors that contain both CPUs and accelerators such as GPUs. Benedict has contributed extensively to OpenCL's design and has represented AMD at the Khronos Group open standards consortium. Benedict has a Ph.D. in computer science for his work on type systems for extensible records and variants.
Authors/Editors
Further Information & Material
Front Cover
Heterogeneous Computing with OpenCL
Copyright
Contents
Foreword
Preface
  Our Heterogeneous World
  OpenCL
  This Text
Acknowledgments
About the Authors
Chapter 1: Introduction to Parallel Programming
  Introduction
  OpenCL
  The Goals of This Book
  Thinking Parallel
  Concurrency and Parallel Programming Models
  Structure
  Reference
  Further Reading and Relevant Websites
Chapter 2: Introduction to OpenCL
  Introduction
  Platform and Devices
  The Execution Environment
  Memory Model
  Writing Kernels
  Full Source Code Example for Vector Addition
  Summary
  Reference
Chapter 3: OpenCL Device Architectures
  Introduction
  Hardware Trade-offs
  The Architectural Design Space
  Summary
  References
Chapter 4: Basic OpenCL Examples
  Introduction
  Example Applications
  Compiling OpenCL Host Applications
  Summary
Chapter 5: Understanding OpenCL's Concurrency and Execution Model
  Introduction
  Kernels, Work-Items, Workgroups, and the Execution Domain
  OpenCL Synchronization: Kernels, Fences, and Barriers
  Queuing and Global Synchronization
  The Host-Side Memory Model
  The Device-Side Memory Model
  Summary
Chapter 6: Dissecting a CPU/GPU OpenCL Implementation
  Introduction
  OpenCL on an AMD Phenom II X6
  OpenCL on the AMD Radeon HD6970 GPU
  Memory Performance Considerations in OpenCL
  Summary
  References
Chapter 7: OpenCL Case Study
  Introduction
  Convolution Kernel
  Conclusions
  Code Listings
  Reference
Chapter 8: OpenCL Case Study
  Introduction
  Getting Video Frames
  Processing a Video in OpenCL
  Processing Multiple Videos with Multiple Special Effects
  Display to Screen of Final Output
  Summary
Chapter 9: OpenCL Case Study
  Introduction
  Choosing the Number of Workgroups
  Choosing the Optimal Workgroup Size
  Optimizing Global Memory Data Access Patterns
  Using Atomics to Perform Local Histogram
  Optimizing Local Memory Access
  Local Histogram Reduction
  The Global Reduction
  Full Kernel Code
  Performance and Summary
Chapter 10: OpenCL Case Study
  Introduction
  Overview of the Computation
  GPU Implementation
  CPU Implementation
  Load Balancing
  Performance and Summary
  Kernel for Uniform Grid Creation
  Kernels for Simulation
Chapter 11: OpenCL Extensions
  Introduction
  Overview of Extension Mechanism
  Device Fission
  Double Precision
  References
Chapter 12: OpenCL Profiling and Debugging
  Introduction
  Profiling with Events
  AMD Accelerated Parallel Processing Profiler
  AMD Accelerated Parallel Processing KernelAnalyzer
  Walking through the AMD APP Profiler
  Debugging OpenCL Applications
  Overview of gDEBugger
  AMD Printf Extension
  Conclusion
Chapter 13: WebCL
  Introduction
  Designing the Framework
  WebCL Pilot Implementation
  WebCL Hands-on
  Web Photo Editor
  Discussion
  Summary
  Reference
  Further Reading and Relevant Websites
Index




