E-book, English, 382 pages
Iniewski CMOS Processors and Memories
1st edition 2010
ISBN: 978-90-481-9216-8
Publisher: Springer Netherlands
Format: PDF
Copy protection: 1 - PDF Watermark
Series: Analog Circuits and Signal Processing
CMOS Processors and Memories addresses the state of the art in integrated circuit design in the context of emerging computing systems. New design opportunities in memories and processors are discussed. Emerging materials that can take system performance beyond standard CMOS, such as carbon nanotubes, graphene, ferroelectrics, and tunnel junctions, are explored.

CMOS Processors and Memories is divided into two parts: processors and memories. The first part starts with high-performance, low-power processor design, followed by a chapter on multi-core processing. Both represent state-of-the-art concepts in the current computing industry. The third chapter deals with asynchronous design, which still holds considerable promise for future computing needs. The fourth chapter presents a 'hardware design space exploration' methodology for implementing and analyzing the hardware for the Bayesian inference framework. This methodology involves analyzing the computational cost, exploring candidate hardware components, and proposing various custom architectures using both traditional CMOS and hybrid nanotechnology CMOL. The first part concludes with hybrid CMOS-Nano architectures.

The second part, on memories, covers state-of-the-art SRAM, DRAM, and flash memories as well as emerging device concepts. Semiconductor memory is a good example of full custom design that applies various analog and logic circuits to exploit the device physics of the memory cell. Critical physical effects in flash memory, including tunneling, hot electron injection, and charge trapping, are discussed in detail. Emerging memories such as FRAM, PRAM, and ReRAM, which depend on magnetization, electron spin alignment, the ferroelectric effect, built-in potential wells, quantum effects, and thermal melting, are also described.

CMOS Processors and Memories is a must for anyone serious about circuit design for future computing technologies. The book is written by top-notch international experts in industry and academia. It can be used in a graduate course curriculum.
Krzysztof (Kris) Iniewski manages R&D at Redlen Technologies Inc., a start-up company in British Columbia. His research interests are in VLSI circuits for medical and security applications. He is also an executive director of CMOS Emerging Technologies (www.cmoset.com). From 2004 to 2006 he was an Associate Professor in the Electrical Engineering and Computer Engineering Department of the University of Alberta, where he conducted research on low-power wireless circuits and systems. During his tenure in Edmonton he put together a book for CRC Press, 'Wireless Technologies: Circuits, Systems and Devices'. From 1995 to 2003 he was with PMC-Sierra and held various senior technical and management positions. Prior to joining PMC-Sierra, from 1990 to 1994, he was an Assistant Professor in Electrical Engineering and Computer Engineering at the University of Toronto. Dr. Iniewski has published over 100 research papers in international journals and conferences. He holds 18 international patents granted in the USA, Canada, France, Germany, and Japan. He received his Ph.D. degree in electronics (honors) from the Warsaw University of Technology (Warsaw, Poland) in 1988. Together with Carl McCrosky and Dan Minoli he is an author of 'Data Networks - VLSI and Optical Fibre', Wiley, 2008. He recently edited 'Medical Imaging Electronics', Wiley, 2009, 'VLSI Circuits for Bio-medical Applications', Artech House, 2008, and 'Circuits at Nanoscale: Communications, Imaging and Sensing', CRC Press, 2008.
Authors/Editors
Further Information & Material
1;Contents;6
2;Part I:Processors;8
2.1;Chapter 1: Design of High Performance Low Power Microprocessors;9
2.1.1;1.1 Introduction;10
2.1.2;1.2 Concurrent Multi-threading (CMT);10
2.1.3;1.3 Power and Power Management;11
2.1.3.1;1.3.1 Dynamic Power;11
2.1.3.1.1;1.3.1.1 Activity Factor and Switching Capacitance;12
2.1.3.1.2;1.3.1.2 Voltage (VDD) and Frequency of Operation;13
2.1.3.1.3;1.3.1.3 Crowbar Power;14
2.1.3.2;1.3.2 Static (Leakage) Power;14
2.1.3.2.1;1.3.2.1 Sub-threshold Leakage;14
2.1.3.2.2;1.3.2.2 Gate Leakage;15
2.1.3.2.3;1.3.2.3 Diode Leakage;15
2.1.3.2.4;1.3.2.4 Effect of VDD and Temperature on Leakage;16
2.1.3.2.5;1.3.2.5 Back Bias;18
2.1.3.3;1.3.3 Using VDD and Back-Bias to Optimize for Performance and Power;18
2.1.3.4;1.3.4 Power Management: What and How?;20
2.1.3.4.1;1.3.4.1 Dynamic Voltage and Frequency Scaling;20
2.1.3.4.2;1.3.4.2 Other Power Management Techniques;22
2.1.4;1.4 Clock Design;23
2.1.4.1;1.4.1 Clock Skew/Clock Uncertainty;26
2.1.4.1.1;1.4.1.1 Sources of Clock Skew/Clock Uncertainty;26
2.1.5;1.5 Memory Design;27
2.1.5.1;1.5.1 The 6-T Memory Cell;28
2.1.5.1.1;1.5.1.1 Important Metrics/Tests for Evaluating a 6-T Memory Cell;29
2.1.5.2;1.5.2 Memory Redundancy;30
2.1.5.3;1.5.3 The Importance of Statistical Analysis;30
2.1.6;1.6 Process Technology and Impact of Layout on Performance and Power;31
2.1.7;1.7 Conclusion;32
2.1.8;References;32
2.2;Chapter 2: Towards High-Performance and Energy-Efficient Multi-core Processors;34
2.2.1;2.1 Motivating Multi-core Processors;34
2.2.1.1;2.1.1 Challenges on Uni-core Processors;34
2.2.1.1.1;2.1.1.1 High Performance Innovations are Challenging;35
2.2.1.1.2;2.1.1.2 Power Dissipation Becomes the Key Constraint;35
2.2.1.1.3;2.1.1.3 The Gap Between Dream and Reality on Performance and Energy Efficiency;37
2.2.1.1.4;2.1.1.4 Future Fabrication Technologies Imposing New Challenges;37
2.2.1.2;2.1.2 Challenges on ASIC Implementations;38
2.2.1.3;2.1.3 Solution: Multi-core Processors;39
2.2.2;2.2 Pioneering Multiprocessor Systems and Multi-core Processors;41
2.2.2.1;2.2.1 Communication Model: Shared-Memory and Message Passing;41
2.2.2.2;2.2.2 Interconnect Topology;42
2.2.2.3;2.2.3 Some Design Cases;42
2.2.3;2.3 Modern Multi-core Processors;43
2.2.3.1;2.3.1 Design Cases of Modern Multi-core Processors;44
2.2.3.2;2.3.2 Distinguishing Multi-core Processors;48
2.2.4;2.4 Looking Forward to the Future of Multi-core Processors;51
2.2.4.1;2.4.1 There is no Universal Multi-core Processor;51
2.2.4.2;2.4.2 Fault-Tolerance Will Become a Key Issue in Multi-core Processors;52
2.2.5;References;53
2.3;Chapter 3: Low Power Asynchronous Circuit Design: An FFT/IFFT Processor;57
2.3.1;3.1 Introduction;57
2.3.2;3.2 Synchronization: Synchronous and Asynchronous;58
2.3.2.1;3.2.1 Synchronous Approach;59
2.3.2.2;3.2.2 Asynchronous Approach;60
2.3.2.2.1;3.2.2.1 Delay Models;61
2.3.2.2.2;3.2.2.2 Handshaking Protocols and Channels;61
2.3.2.2.3;3.2.2.3 Data Encoding;64
2.3.2.2.4;3.2.2.4 Asynchronous Pipelines;64
2.3.3;3.3 Low Power Asynchronous Micro/Macro Cells;65
2.3.3.1;3.3.1 Latch Adder;65
2.3.3.2;3.3.2 Asynchronous Carry Completion Sensing Adders;68
2.3.3.3;3.3.3 Multiplier;75
2.3.3.4;3.3.4 Memory;82
2.3.4;3.4 Low Power Asynchronous FFT/IFFT Processor;84
2.3.4.1;3.4.1 FFT/IFFT Algorithm;85
2.3.4.2;3.4.2 Benchmarked Synchronous FFT/IFFT Processor;86
2.3.4.3;3.4.3 Asynchronous FFT/IFFT Processor;87
2.3.4.4;3.4.4 Comparison of the Synchronous and Asynchronous Designs;92
2.3.5;3.5 Conclusions;97
2.3.6;References;98
2.4;Chapter 4: CMOL/CMOS Implementations of Bayesian Inference Engine: Digital and Mixed-Signal Architectures and Performance/Price;100
2.4.1;4.1 Introduction;101
2.4.2;4.2 Hardware for Computational Models;103
2.4.2.1;4.2.1 Hardware Virtualization Spectrum;103
2.4.2.2;4.2.2 Existing Hardware Implementations of George and Hawkins’ Model;104
2.4.2.3;4.2.3 Hardware Design Space Exploration: An Architecture Assessment Methodology;105
2.4.3;4.3 A Bayesian Memory (BM) Module;105
2.4.4;4.4 Hardware Architectures for Bayesian Memory;109
2.4.4.1;4.4.1 Definition of Hardware Architectures for BM;109
2.4.4.2;4.4.2 General Issues;111
2.4.4.2.1;4.4.2.1 Precision/Bits;111
2.4.4.2.2;4.4.2.2 Communication;112
2.4.4.2.3;4.4.2.3 Number of Parent and Child BMs, and Code Book (CB) Size;112
2.4.4.2.4;4.4.2.4 Virtualization;113
2.4.4.2.5;4.4.2.5 Hybrid Nanotechnology – CMOL;113
2.4.5;4.5 Digital CMOS and CMOL Hardware Architectures for Bayesian Memory (BM);114
2.4.5.1;4.5.1 Floating-Point (FLP) Architecture;115
2.4.5.2;4.5.2 Logarithmic Number System (LNS) Architecture;116
2.4.5.3;4.5.3 Fixed-Point (FXP) Architecture;116
2.4.6;4.6 Mixed-Signal (MS) CMOS and CMOL Hardware Architectures for Bayesian Memory (BM);118
2.4.6.1;4.6.1 Mixed-Signal CMOS Architecture;119
2.4.6.2;4.6.2 Mixed-Signal CMOL Architecture;121
2.4.7;4.7 Performance/Price Analysis and Results;122
2.4.7.1;4.7.1 Performance/Price Analysis;122
2.4.7.2;4.7.2 Performance/Price Results and Discussion;124
2.4.7.3;4.7.3 Scaling Estimates for BM Based Cortex-Scale System;129
2.4.8;4.8 Conclusion, Contribution and Future Work;131
2.4.9;4.9 Appendix;132
2.4.9.1;4.9.1 Digital FLP or LNS Architecture;132
2.4.9.1.1;4.9.1.1 Time;132
2.4.9.1.2;4.9.1.2 Area;133
2.4.9.1.3;4.9.1.3 Power;133
2.4.9.2;4.9.2 Digital FXP Architecture;133
2.4.9.2.1;4.9.2.1 Time;133
2.4.9.2.2;4.9.2.2 Area;134
2.4.9.2.3;4.9.2.3 Power;134
2.4.9.3;4.9.3 Mixed-Signal CMOS Architecture;135
2.4.9.3.1;4.9.3.1 Time;135
2.4.9.3.2;4.9.3.2 Area;135
2.4.9.3.3;4.9.3.3 Power;135
2.4.9.4;4.9.4 Mixed-Signal CMOL Architecture;136
2.4.9.4.1;4.9.4.1 MS CMOL Nanogrid for the SVMM;136
2.4.9.4.2;4.9.4.2 Time;136
2.4.9.4.3;4.9.4.3 Area;137
2.4.9.4.4;4.9.4.4 Power;137
2.4.9.5;4.9.5 Example: Use of Architecture Assessment Methodology for Associative Memory Model;137
2.4.10;References;139
2.5;Chapter 5: A Hybrid CMOS-Nano FPGA Based on Majority Logic: From Devices to Architecture;142
2.5.1;5.1 Introduction;142
2.5.2;5.2 Nanoscale Technologies and Devices;144
2.5.2.1;5.2.1 Nanoscale Switches and the Crossbar Array;144
2.5.2.1.1;5.2.1.1 Nanoscale Switches;145
2.5.2.1.1.1;Self Assembled Molecular Electronics;145
2.5.2.1.1.2;Phase Change Devices;146
2.5.2.1.1.3;Generic MRAM Device;146
2.5.2.1.1.4;Metal Oxide Device;148
2.5.2.1.2;5.2.1.2 Resonant Tunneling Diodes;148
2.5.2.2;5.2.2 Fundamental Circuits;150
2.5.2.2.1;5.2.2.1 Crossbar Array;150
2.5.2.2.2;5.2.2.2 Programmable Logic Array (PLA);150
2.5.2.2.3;5.2.2.3 Goto Pair;151
2.5.2.3;5.2.3 Device Modelling;152
2.5.3;5.3 The Programmable Majority Logic Array;154
2.5.4;5.4 A CMOS-Nano PMLA Based FPGA;157
2.5.4.1;5.4.1 Partitioning Logic Between CMOS and Nano;158
2.5.4.1.1;5.4.1.1 All Nano PMLA Mapping;158
2.5.4.1.2;5.4.1.2 Equal Partitioning Between CMOS and Nano;159
2.5.4.2;5.4.2 Results and Comparisons;159
2.5.4.3;5.4.3 Impact: Considerations for Designing CMOS/Nano Circuits;160
2.5.5;5.5 Future Prospects (Memristors);161
2.5.6;5.6 Summary;162
2.5.7;References;162
3;Part II:Memories;165
3.1;Chapter 6: Memory Systems for Nano-computer;166
3.1.1;6.1 Introduction;166
3.1.1.1;6.1.1 The Value of a Computer;166
3.1.1.2;6.1.2 The Origin of a Computer Body;167
3.1.1.3;6.1.3 The Birth of a Memory Device;168
3.1.1.4;6.1.4 The Brief on a Computer System;169
3.1.1.5;6.1.5 The Brief on a Memory Hierarchy;169
3.1.2;6.2 Memory Devices and Circuits;170
3.1.2.1;6.2.1 The Foundation of a Memory Core;170
3.1.2.2;6.2.2 The Foundation of a Memory Design;173
3.1.2.2.1;6.2.2.1 Analog Circuits for a Memory Device;173
3.1.2.2.2;6.2.2.2 Logic Circuits for a Memory Device;179
3.1.2.3;6.2.3 The Brief on the Interconnect Issues;182
3.1.3;6.3 Memory Hierarchy and Hardware Compositions;182
3.1.3.1;6.3.1 The Types of Memories;182
3.1.3.2;6.3.2 The Memory Architectures;183
3.1.3.3;6.3.3 The Brief on a Memory Controller;185
3.1.3.4;6.3.4 The Memory Hierarchy;185
3.1.3.5;6.3.5 The Future of the Memory Hierarchy;187
3.1.3.6;6.3.6 The Device-Level Innovations;188
3.1.4;6.4 Software Interfaces of Memory Devices;189
3.1.4.1;6.4.1 The Memory as a System Resource;189
3.1.4.2;6.4.2 The Software Overhead in a Memory-related Performance;191
3.1.4.3;6.4.3 The Future Computing System;191
3.1.4.3.1;6.4.3.1 Evolution of a Memory System;191
3.1.4.3.2;6.4.3.2 Evolution of a Storage System;192
3.1.5;6.5 The Top-down Approach: Wrap-up;193
3.1.6;6.6 Conclusion;194
3.1.7;References;195
3.2;Chapter 7: Flash Memory;198
3.2.1;7.1 Introduction to Flash Memory;198
3.2.1.1;7.1.1 Introduction;198
3.2.1.2;7.1.2 Semiconductor Memory;199
3.2.1.2.1;Non-Volatile Memory;200
3.2.1.3;7.1.3 Flash Memory;200
3.2.2;7.2 Flash Memory Architecture;202
3.2.2.1;7.2.1 Chip Architecture;202
3.2.2.2;7.2.2 Basic Operating Principles of Flash-Cell;205
3.2.2.3;7.2.3 Memory Array Architecture;206
3.2.2.4;7.2.4 Program Operation;208
3.2.2.5;7.2.5 Erase Operation;211
3.2.2.6;7.2.6 Read Operation;214
3.2.3;7.3 MLC Technology;217
3.2.3.1;7.3.1 Concept of MLC Technology;217
3.2.3.2;7.3.2 Precise Charge Placement in MLC Technology;217
3.2.3.3;7.3.3 Precise Charge Sensing in MLC Technology;220
3.2.4;7.4 Flash Memory Reliability;223
3.2.4.1;7.4.1 Endurance;223
3.2.4.2;7.4.2 Data Retention;225
3.2.5;7.5 Flash Memory Scaling;227
3.2.5.1;7.5.1 Cell Scaling Issues;227
3.2.5.2;7.5.2 Alternative Method for High Density;230
3.2.6;7.6 Conclusions;231
3.2.7;References;231
3.3;Chapter 8: CMOS-based Spin-Transfer Torque Magnetic Random Access Memory (ST–MRAM);234
3.3.1;8.1 Introduction;235
3.3.2;8.2 CMOS-based ST–MRAM Elements;236
3.3.2.1;8.2.1 Background;236
3.3.2.2;8.2.2 Current Issues;240
3.3.3;8.3 Magnetization Dynamics in ST–MRAM Elements;241
3.3.3.1;8.3.1 Method of Micromagnetic Modeling;242
3.3.3.2;8.3.2 Spin-Polarized Current Pulse Switching of Ni80Fe20/Cu/Co Nanopillar Elements;242
3.3.3.3;8.3.3 Fabrication and Characterization of Prototype ST–MRAM;247
3.3.3.3.1;8.3.3.1 Fabrication of 8 × 8 Array of ST–MRAM Nanopillar Elements;247
3.3.3.3.2;8.3.3.2 Characterization of ST–MRAM Nanopillar Elements;247
3.3.4;8.4 Conclusions;250
3.3.5;References;251
3.4;Chapter 9: Magnetization Switching in Spin Torque Random Access Memory: Challenges and Opportunities;254
3.4.1;9.1 Introduction to Spin Torque Random Access Memory;255
3.4.2;9.2 Magnetization Switching Challenges as SPRAM Scales Down;256
3.4.3;9.3 SPRAM Device Characterization and System Scaling Down Requirement;260
3.4.3.1;9.3.1 Characterization of Spin Torque Induced Magnetization Switching;260
3.4.3.2;9.3.2 SPRAM System Dynamic Modeling;266
3.4.3.3;9.3.3 SPRAM Scale Down Requirements;269
3.4.4;9.4 Research and Development Opportunities for Switching Current Reduction and Variability Control;273
3.4.4.1;9.4.1 Current Reduction Through Changing Magnetic Properties;273
3.4.4.2;9.4.2 Current Reduction Through Decreasing Damping by Changing Interfacial Tunneling Properties;275
3.4.4.3;9.4.3 Current Reduction Through Increasing Spin Torque Efficiency;278
3.4.4.4;9.4.4 Current Reduction Through Time and Spatial Varying Polarized Current;281
3.4.4.5;9.4.5 Current Reduction Through Coupled Magnetic Elements and Nonuniform Magnetization Switching;284
3.4.4.6;9.4.6 Current Reduction Through Thermal Spin Torque Switching;286
3.4.4.7;9.4.7 Variability Control at Device Level;287
3.4.4.8;9.4.8 Variability Control at System Level;291
3.4.5;References;293
3.5;Chapter 10: High Performance Embedded Dynamic Random Access Memory in Nano-Scale Technologies;296
3.5.1;10.1 Introduction;296
3.5.2;10.2 Evolution for High Performance Embedded DRAMs;297
3.5.3;10.3 Principles of High Performance Embedded DRAM Technology, Architecture, and Designs;299
3.5.3.1;10.3.1 Technology;299
3.5.3.2;10.3.2 Macro Architecture;301
3.5.3.3;10.3.3 Modes of Operation;304
3.5.3.3.1;10.3.3.1 Single Bank Fast Random Access Cycle Mode;304
3.5.3.3.2;10.3.3.2 Multi Bank Pipeline Mode;306
3.5.3.4;10.3.4 Wordline Architectures;307
3.5.3.5;10.3.5 Bitline Architectures;309
3.5.3.6;10.3.6 Sensing Schemes;310
3.5.3.7;10.3.7 Late Write, Early Write, and Direct Write;312
3.5.3.8;10.3.8 Negative Wordline Architecture;312
3.5.3.9;10.3.9 Concurrent Refresh Mode;313
3.5.3.10;10.3.10 Redundancy;314
3.5.3.11;10.3.11 Test Methodology;316
3.5.4;10.4 IBM Embedded DRAM Macros;316
3.5.5;10.4.1 Embedded DRAMs for ASIC;317
3.5.5.1;10.4.2 Embedded DRAM with Destructive Read Architecture;319
3.5.5.2;10.4.3 Embedded DRAM for High-performance SOI Microprocessor;320
3.5.6;10.5 High Performance Cache with Embedded DRAM Macros;323
3.5.6.1;10.5.1 Architecture;323
3.5.6.2;10.5.2 ABIST and FAR;324
3.5.6.3;10.5.3 Refresh Management;325
3.5.7;10.6 Future Work;327
3.5.7.1;10.6.1 Embedded DRAM with Floating Body Cell;327
3.5.7.2;10.6.2 Embedded DRAM with Gain Cell;328
3.5.7.3;10.6.3 Embedded DRAM with Twin Cell;330
3.5.7.4;10.6.4 Embedded DRAM for 3 Dimensional Integration;331
3.5.7.5;10.6.5 Summary;333
3.5.8;References;334
3.6;Chapter 11: Timing Circuit Design in High Performance DRAM;338
3.6.1;11.1 Introduction;338
3.6.1.1;11.1.1 Memory Interface;338
3.6.1.2;11.1.2 Evolution of the DRAM Interface and Timing Specifications;339
3.6.1.3;11.1.3 Source-Synchronous Interface and Matched Routing;340
3.6.1.4;11.1.4 Timing Adjust Circuitry;341
3.6.2;11.2 Clock Distribution Network;342
3.6.2.1;11.2.1 CML Versus CMOS;342
3.6.2.2;11.2.2 Clock Division and Multiphase Clocking;344
3.6.2.3;11.2.3 Voltage and Temperature Insensitive CDN;345
3.6.2.4;11.2.4 Self-adaptive Bias Generator;347
3.6.2.5;11.2.5 Simulation Results;348
3.6.3;11.3 Clock Synchronization Circuits;349
3.6.3.1;11.3.1 MDLL Clocking Architecture;349
3.6.3.2;11.3.2 Fast-Lock Digital DLL;351
3.6.3.3;11.3.3 Analog Phase Generator (APG);353
3.6.3.4;11.3.4 Measurement Results;355
3.6.3.5;11.3.5 Other Consideration;357
3.6.4;11.4 Future Directions for Nanoscaled DRAM Interface;359
3.6.5;References;360
3.7;Chapter 12: Overview and Scaling Prospect of Ferroelectric Memories;362
3.7.1;12.1 Introduction;362
3.7.2;12.2 FeRAM Principle and Read/Write Mechanism;363
3.7.3;12.3 Conventional 1T/1C FeRAM and Current Memory Cell Structures;365
3.7.4;12.4 Chain FeRAM Architecture and Development History;366
3.7.5;12.5 Scaling Techniques to Reduce Bitline Capacitance;368
3.7.6;12.6 Dummy Cell Design Techniques;371
3.7.7;12.7 Cell Signal Enhancement Techniques;372
3.7.8;12.8 Reliability Issues;374
3.7.9;12.9 Future Prospect of FeRAMs – 3D Capacitor and New FeRAM;375
3.7.10;12.10 Application as Nonvolatile FeRAM Cache;377
3.7.11;12.11 Conclusions;379
3.7.12;References;379




