Gruber / Keller | HPC@Green IT | E-Book | www.sack.de

E-book, English, 221 pages

Gruber / Keller HPC@Green IT

Green High Performance Computing Methods
1st edition, 2010
ISBN: 978-3-642-01789-6
Publisher: Springer
Format: PDF
Copy protection: PDF watermark




Making the most efficient use of computer systems has rapidly become a leading topic of interest for the computer industry and its customers alike. However, the focus of these discussions is often on single, isolated, and specific architectural and technological improvements for power reduction and conservation, while ignoring the fact that power efficiency, as a ratio of performance to power consumption, is equally influenced by performance improvements and architectural power reduction. Furthermore, efficiency can be influenced on all levels of today's system hierarchies, from single cores all the way to distributed Grid environments. Improving execution and power efficiency requires progress in such diverse fields as program optimization, optimization of program scheduling, and power reduction of idling system components, on all levels of the system hierarchy.

Improving computer system efficiency requires improving system performance and reducing system power consumption. To research and reach reasonable conclusions about system performance, we need not only to understand the architectures of our computer systems and the available array of code transformations for performance optimizations, but also to be able to express this understanding in performance models good enough to guide decisions about code optimizations for specific systems. This understanding is necessary on all levels of the system hierarchy, from single cores to nodes to full high performance computing (HPC) systems, and eventually to Grid environments with multiple systems and resources.
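The efficiency metric the blurb defines, power efficiency as the ratio of performance to power consumption, can be sketched in a few lines. The systems and all figures below are invented for illustration only and do not come from the book:

```python
# Power efficiency as defined above: sustained performance divided by
# power draw. All numbers here are hypothetical examples.

def power_efficiency(gflops: float, watts: float) -> float:
    """Return sustained performance per watt (GFlop/s per W)."""
    return gflops / watts

# Two hypothetical systems: one faster but power-hungry, one slower but frugal.
system_a = power_efficiency(gflops=500.0, watts=2000.0)  # 0.25 GFlop/s per W
system_b = power_efficiency(gflops=300.0, watts=1000.0)  # 0.30 GFlop/s per W

# As the blurb notes, the ratio improves either by raising performance
# or by lowering power consumption; the slower machine wins here.
assert system_b > system_a
```

The point of the two cases is the one the text makes: a raw-performance ranking and a performance-per-watt ranking need not agree.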

Ralf Gruber won the Cray Gigaflop Performance Award in 1989 with the world's fastest parallel program, running at a sustained 1.7 GFlop/s. He was responsible for the Swiss-Tx cluster project, a co-operation between EPFL, Compaq, and Supercomputing Systems. For the past six years he has taught the doctoral school course on 'High Performance Computing Methods'.

Vincent Keller received his Master's degree in Computer Science from the University of Geneva (Switzerland) in 2004, and his PhD degree in 2008 from the Swiss Federal Institute of Technology (EPFL) in the fields of HPCN and HPC Grids. Since 2009, Dr. Vincent Keller has held a full-time researcher position at the University of Bonn in Germany. His research interests are HPC application analysis, Grid and cluster computing, and the energy efficiency of large computing ecosystems.

Order Gruber / Keller, HPC@Green IT now!

Further information & material


Foreword (p. 5)
Preface (p. 7)
Acknowledgements (p. 9)
Contents (p. 11)
1 Introduction (p. 16)
  1.1 Basic goals of the book (p. 16)
  1.2 What do I get for one Watt today? (p. 16)
  1.3 Main memory bottleneck (p. 18)
  1.4 Optimize resource usage (p. 18)
  1.5 Application design (p. 19)
  1.6 Organization of the book (p. 19)
    1.6.1 Historical aspects (p. 19)
    1.6.2 Parameterization (p. 20)
    1.6.3 Models (p. 20)
    1.6.4 Core optimization (p. 21)
    1.6.5 Node optimization (p. 21)
    1.6.6 Cluster optimization (p. 21)
    1.6.7 Grid-brokering to save energy (p. 22)
2 Historical highlights (p. 23)
  2.1 Evolution of computing (p. 23)
  2.2 The first computer companies (p. 28)
    2.2.1 ERA, EMCC and Univac (p. 28)
    2.2.2 Control Data Corporation, CDC (p. 28)
    2.2.3 Cray Research (p. 29)
    2.2.4 Thinking Machines Corporation (p. 30)
    2.2.5 International Business Machines (IBM) (p. 31)
    2.2.6 The ASCI effort (p. 32)
    2.2.7 The Japanese efforts (p. 33)
  2.3 The computer generations (p. 34)
  2.4 The evolution in computing performance (p. 34)
  2.5 Performance/price evolution (p. 36)
  2.6 Evolution of basic software (p. 36)
  2.7 Evolution of algorithmic complexity (p. 37)
  2.8 The TOP500 list (p. 39)
    2.8.1 Outlook with the TOP500 curves (p. 41)
    2.8.2 The GREEN500 list (p. 42)
    2.8.3 Proposal for a REAL500 list (p. 44)
3 Parameterization (p. 45)
  3.1 Definitions (p. 45)
  3.2 Parameterization of applications (p. 49)
    3.2.1 Application parameter set (p. 49)
    3.2.2 Parameterization of BLAS library routines (p. 50)
    3.2.3 SMXV: Parameterization of sparse matrix*vector operation (p. 52)
  3.3 Parameterization of a computational node (Pi, ri) (p. 53)
  3.4 Parameterization of the interconnection networks (p. 55)
    3.4.1 Types of networks (p. 55)
    3.4.2 Parameterization of clusters and networks (p. 56)
  3.5 Parameters related to running applications (p. 58)
  3.6 Conclusion (p. 61)
4 Models (p. 62)
  4.1 The performance prediction model (p. 62)
  4.2 The execution time evaluation model (ETEM) (p. 66)
  4.3 A network performance model (p. 66)
  4.4 The extended -- model (p. 68)
  4.5 Validation of the models (p. 69)
    4.5.1 Methodology (p. 69)
    4.5.2 Example: The full matrix*matrix multiplication DGEMM (p. 70)
    4.5.3 Example: Sparse matrix*vector multiplication SMXV (p. 72)
5 Core optimization (p. 75)
  5.1 Some useful notions (p. 75)
    5.1.1 Data hierarchy (p. 75)
    5.1.2 Data representation (p. 76)
    5.1.3 Floating point operations (p. 79)
    5.1.4 Pipelining (p. 80)
  5.2 Single core optimization (p. 82)
    5.2.1 Single core architectures (p. 82)
    5.2.2 Memory conflicts (p. 82)
    5.2.3 Indirect addressing (p. 86)
    5.2.4 Unrolling (p. 87)
    5.2.5 Dependency (p. 88)
    5.2.6 Inlining (p. 90)
    5.2.7 If statement in a loop (p. 90)
    5.2.8 Code porting aspects (p. 91)
    5.2.9 How to develop application software (p. 95)
  5.3 Application to plasma physics codes (p. 96)
    5.3.1 Tokamaks and Stellarators (p. 96)
    5.3.2 Optimization of VMEC (p. 100)
    5.3.3 Optimization of TERPSICHORE (p. 103)
    5.3.4 Conclusions for single core optimization (p. 106)
6 Node optimization (p. 107)
  6.1 Shared memory computer architectures (p. 107)
    6.1.1 SMP/NUMA architectures (p. 107)
    6.1.2 The Cell (p. 111)
    6.1.3 GPGPU for HPC (p. 112)
  6.2 Node comparison and OpenMP (p. 117)
    6.2.1 Race condition with OpenMP (p. 121)
  6.3 Application optimization with OpenMP: the 3D Helmholtz solver (p. 122)
    6.3.1 Fast Helmholtz solver for parallelepipedic geometries (p. 123)
    6.3.2 NEC SX-5 reference benchmark (p. 125)
    6.3.3 Single processor benchmarks (p. 126)
    6.3.4 Parallelization with OpenMP (p. 127)
    6.3.5 Parallelization with MPI (p. 127)
    6.3.6 Conclusion (p. 131)
  6.4 Application optimization with OpenMP: TERPSICHORE (p. 131)
7 Cluster optimization (p. 133)
  7.1 Introduction on parallelization (p. 133)
  7.2 Internode communication networks (p. 133)
    7.2.1 Network architectures (p. 133)
    7.2.2 Comparison between network architectures (p. 141)
  7.3 Distributed memory parallel computer architectures (p. 143)
    7.3.1 Integrated parallel computer architectures (p. 143)
    7.3.2 Commodity cluster architectures (p. 146)
    7.3.3 Energy consumption issues (p. 148)
    7.3.4 The issue of resilience (p. 149)
  7.4 Types of parallel applications (p. 150)
    7.4.1 Embarrassingly parallel applications (p. 150)
    7.4.2 Applications with point-to-point communications (p. 150)
    7.4.3 Applications with multicast communication needs (p. 151)
    7.4.4 Shared memory applications (OpenMP) (p. 151)
    7.4.5 Component-based applications (p. 151)
  7.5 Domain decomposition techniques (p. 151)
    7.5.1 Test example: The Gyrotron (p. 152)
    7.5.2 The geometry and the mesh (p. 154)
    7.5.3 Connectivity conditions (p. 154)
    7.5.4 Parallel matrix solver (p. 155)
    7.5.5 The electrostatic precipitator (p. 157)
  7.6 Scheduling of parallel applications (p. 158)
    7.6.1 Static scheduling (p. 158)
    7.6.2 Dynamic scheduling (p. 158)
  7.7 SpecuLOOS (p. 159)
    7.7.1 Introduction (p. 159)
    7.7.2 Test case description (p. 159)
    7.7.3 Complexity on one node (p. 161)
    7.7.4 Wrong complexity on the Blue Gene/L (p. 162)
    7.7.5 Fine results on the Blue Gene/L (p. 163)
    7.7.6 Conclusions (p. 163)
  7.8 TERPSICHORE (p. 165)
  7.9 Parallelization of the LEMan code with MPI and OpenMP (p. 166)
    7.9.1 Introduction (p. 166)
    7.9.2 Parallelization (p. 166)
    7.9.3 CPU time results (p. 168)
  7.10 Conclusions (p. 171)
8 Grid-level Brokering to save energy (p. 173)
  8.1 About Grid resource brokering (p. 173)
  8.2 An Introduction to ïanos (p. 174)
    8.2.1 Job Submission Scenario (p. 176)
  8.3 The cost model (p. 177)
    8.3.1 Mathematical formulation (p. 177)
    8.3.2 CPU costs Ke (p. 179)
    8.3.3 License fees K (p. 181)
    8.3.4 Costs due to waiting time Kw (p. 181)
    8.3.5 Energy costs Keco (p. 181)
    8.3.6 Data transfer costs Kd (p. 183)
    8.3.7 Example: The Pleiades clusters' CPU cost per hour (p. 183)
    8.3.8 Different currencies in a Grid environment (p. 185)
  8.4 The implementation (p. 185)
    8.4.1 Architecture & Design (p. 186)
    8.4.2 The Grid Adapter (p. 186)
    8.4.3 The Meta Scheduling Service (MSS) (p. 187)
    8.4.4 The Resource Broker (p. 188)
    8.4.5 The System Information (p. 189)
    8.4.6 The Data Warehouse (p. 189)
    8.4.7 The Monitoring Service (p. 189)
    8.4.8 The Monitoring Module VAMOS (p. 190)
    8.4.9 Integration with UNICORE Grid System (p. 191)
    8.4.10 Scheduling algorithm (p. 191)
    8.4.11 User Interfaces to the ïanos framework (p. 193)
  8.5 DVS-able processors (p. 194)
    8.5.1 Power consumption of a CPU (p. 195)
    8.5.2 An algorithm to save energy (p. 196)
    8.5.3 First results with SMXV (p. 197)
    8.5.4 A first implementation (p. 198)
  8.6 Conclusions (p. 200)
9 Recommendations (p. 201)
  9.1 Application oriented recommendations (p. 201)
    9.1.1 Code development (p. 201)
    9.1.2 Code validation (p. 201)
    9.1.3 Porting codes (p. 202)
    9.1.4 Optimizing parallelized applications (p. 202)
    9.1.5 Race condition (p. 202)
  9.2 Hardware and basic software aspects (p. 203)
    9.2.1 Basic software (p. 203)
    9.2.2 Choice of system software (p. 204)
  9.3 Energy reduction (p. 204)
    9.3.1 Processor frequency adaptation (p. 204)
    9.3.2 Improved cooling (p. 205)
    9.3.3 Choice of optimal resources (p. 205)
    9.3.4 Best choice of new computer (p. 205)
    9.3.5 Last but not least (p. 206)
  9.4 Miscellaneous (p. 206)
    9.4.1 Course material (p. 206)
    9.4.2 A new REAL500 List (p. 206)
Glossary (p. 208)
References (p. 215)
About the authors (p. 222)
Index (p. 224)


