E-book, English, 253 pages
Panda / Silpa / Shrivastava Power-efficient System Design
1st edition 2010
ISBN: 978-1-4419-6388-8
Publisher: Springer
Format: PDF
Copy protection: 1 - PDF Watermark
The information and communication technology (ICT) industry is said to account for 2% of the worldwide carbon emissions - a fraction that continues to grow with the relentless push for more and more sophisticated computing equipment, communications infrastructure, and mobile devices. While computers evolved in the direction of higher and higher performance for most of the latter half of the 20th century, the late 1990s and early 2000s saw a new emerging fundamental concern that has begun to shape our day-to-day thinking in system design - power dissipation. As we elaborate in Chapter 1, a variety of factors colluded to raise power-efficiency as a first-class design concern in the designer's mind, with profound consequences all over the field: semiconductor process design, circuit design, design automation tools, system and application software, all the way to large data centers.

Power-efficient System Design originated from a desire to capture and highlight the exciting developments in the rapidly evolving field of power and energy optimization in electronic and computer-based systems. Tremendous progress has been made in the last two decades, and the topic continues to be a fascinating research area. To develop a clearer focus, we have concentrated on the relatively higher level of design abstraction that is loosely called the system level. In addition to the extensive coverage of traditional power reduction targets such as CPU and memory, the book is distinguished by detailed coverage of relatively modern power optimization ideas focussing on components such as compilers, operating systems, servers, data centers, and graphics processors.
Further Information & Material
Preface
Contents
1 Low Power Design: An Introduction
1.1 The Emergence of Power as an Important Design Metric
1.2 Power Efficiency vs. Energy Efficiency
1.3 Power-Performance Tradeoff
1.4 Power Density
1.5 Power and Energy Benchmarks
1.6 Power Optimizations at the System Level
1.7 Organization of this Book
References
2 Basic Low Power Digital Design
2.1 CMOS Transistor Power Consumption
2.1.1 Switching Power
2.1.2 Short Circuit Power
2.1.3 Leakage Power
2.1.3.1 Reverse Biased Diode Leakage
2.1.3.2 Gate Induced Drain Leakage
2.1.3.3 Gate Oxide Tunneling
2.1.3.4 Subthreshold Leakage
2.2 Trends in Power Consumption
2.3 Techniques for Reducing Dynamic Power
2.3.1 Gate Sizing
2.3.2 Control Synthesis
2.3.3 Clock Gating
2.3.4 Voltage and Frequency Scaling
2.3.4.1 Design-Time Voltage and Frequency Setting
2.3.4.2 Static Voltage and Frequency Scaling
2.3.4.3 Dynamic Voltage and Frequency Scaling
2.4 Techniques for Reducing Short Circuit Power
2.5 Techniques for Reducing Leakage Power
2.5.1 Multiple Supply Voltage
2.5.2 Multiple Threshold Voltage
2.5.3 Adaptive Body Biasing
2.5.4 Transistor Stacking
2.5.5 Power Gating
2.6 Summary
References
3 Power-efficient Processor Architecture
3.1 Introduction
3.1.1 Power Budget: A Major Design Constraint
3.1.1.1 Why does Parallel Processing Reduce Power?
3.1.2 Processor Datapath Architecture
3.1.2.1 Instruction Fetch
3.1.2.2 Decode and Dispatch
3.1.2.3 Issue
3.1.2.4 Execute
3.1.2.5 Commit
3.1.3 Power Dissipation
3.2 Front-end: Fetch and Decode Logic
3.2.1 Fetch Gating
3.2.1.1 Branch Confidence Estimation
3.2.1.2 Rate Mismatch Flow Control
3.2.2 Auxiliary Decode Buffer
3.3 Issue Queue / Dispatch Buffer
3.3.1 Dynamic Adaptation of Issue Queue Size
3.3.2 Zero Byte Encoding
3.3.3 Banking and Bit-line Segmentation
3.3.4 Fast Comparators
3.4 Register File
3.4.1 Port Reduction and Banking
3.4.1.1 Reducing Port Requirements
3.4.1.2 Banking
3.4.2 Clustered Organization
3.4.3 Hierarchical Organization
3.5 Execution Units
3.5.1 Clock Gating
3.5.2 Operand Isolation/Selective Evaluation
3.5.3 Power Gating and Multi-threshold Logic
3.5.3.1 Time Based
3.5.3.2 Branch Prediction Based
3.5.3.3 Compiler Based
3.6 Reorder Buffer
3.6.1 Port Reduction
3.6.2 Distributed ROB
3.6.3 Dynamic ROB Sizing
3.6.4 Zero Bytes and Power Efficient Comparators
3.7 Branch Prediction Unit
3.7.1 Banking of BHT and BTB
3.7.2 Reducing BHT/BTB Lookups
3.8 Summary
References
4 Power-efficient Memory and Cache
4.1 Introduction and Memory Structure
4.1.1 Overview
4.1.2 Memory Structure
4.1.3 Cache Memory
4.1.4 Cache Architecture
4.1.5 Power Dissipation During Memory Access
4.2 Power-efficient Memory Architectures
4.2.1 Partitioned Memory and Caches
4.2.2 Augmenting with Additional Memories
4.2.3 Reducing Tag and Data Array Fetches
4.2.4 Reducing Cache Leakage Power
4.3 Translation Look-aside Buffer (TLB)
4.3.1 TLB Associativity – A Power-performance Trade-off
4.3.2 Banking
4.3.3 Reducing TLB Lookups
4.3.3.1 Deferred Address Translation
4.3.3.2 Using Address Mapping Register
4.4 Scratch Pad Memory
4.4.1 Data Placement in SPM
4.4.2 Dynamic Management of SPM
4.4.3 Storing both Instructions and Data in SPM
4.5 Memory Banking
4.6 Memory Customization
4.7 Reducing Address Bus Switching
4.7.1 Encoding
4.7.2 Data Layout
4.8 DRAM Power Optimization
4.9 Summary
References
5 Power Aware Operating Systems, Compilers, and Application Software
5.1 Operating System Optimizations
5.1.1 Advanced Configuration and Power Interface (ACPI)
5.1.1.1 Power Modes
5.1.2 Dynamic Voltage and Frequency Scaling
5.1.2.1 DVFS in Real-time OS
5.1.3 I/O Device Power Management
5.2 Compiler Optimizations
5.2.1 Loop Transformations
5.2.2 Instruction Encoding
5.2.3 Instruction Scheduling
5.2.4 Dual Instruction Set Architectures
5.2.5 Instruction Set Extension
5.2.6 Power Gating
5.2.7 Dynamic Translation and Recompilation
5.2.8 Compiler Optimizations Targeting Disks
5.3 Application Software
5.3.1 Application-aided Power Management
5.3.2 DVFS Under Application Control
5.3.2.1 MPEG Video Decoder
5.3.2.2 Word Processor
5.3.2.3 Batch Compilation
5.3.3 Output Quality Trade-offs
5.4 Summary
References
6 Power Issues in Servers and Data Centers
6.1 Power Efficiency Challenges
6.1.1 Nameplate Power Overestimates Actual Power
6.1.2 Installed vs. Utilized Capacity
6.1.3 Load Variation
6.2 Where does the Power go?
6.3 Server Power Modeling and Measurement
6.4 Server Power Management
6.4.1 Frequency Scaling
6.4.2 Processor and Memory Packing
6.4.3 Power Shifting
6.5 Cluster and Data Center Power Management
6.5.1 Power Capping/Thresholding
6.5.2 Voltage and Frequency Scaling
6.6 Summary
References
7 Low Power Graphics Processors
7.1 Introduction to Graphics Processing
7.1.1 Graphics Pipeline
7.1.1.1 Application Stage
7.1.1.2 Geometry
7.1.1.3 Triangle Setup
7.1.1.4 Rasterization
7.1.1.5 Display
7.1.2 Graphics Processor Architecture
7.1.3 Power Dissipation in a Graphics Processor
7.2 Programmable Units
7.2.1 Clock Gating
7.2.2 Predictive Shutdown
7.2.3 Code Transformation
7.3 Texture Unit
7.3.1 Custom Memory Architecture – Texture Filter Memory
7.3.2 Texture Compression
7.3.3 Clock Gating
7.4 Raster Operations
7.4.1 Depth Buffer Compression
7.4.2 Color Buffer Compression
7.5 System Level Power Management
7.5.1 Power Modes
7.5.2 Dynamic Voltage and Frequency Scaling
7.5.2.1 History based Workload Estimation
7.5.2.2 Control Theory based Workload Estimation
7.5.2.3 Frame Structure based Workload Estimation
7.5.2.4 Signature based Workload Estimation
7.5.3 Multiple Power Domains
7.6 Summary
References
Index




