E-book, English, 241 pages
Khaled Salah Mohamed: Neuromorphic Computing and Beyond: Parallel, Approximation, Near Memory, and Quantum
1st edition, 2020
ISBN: 978-3-030-37224-8
Publisher: Springer International Publishing
Format: PDF
Copy protection: PDF watermark
This book discusses and compares several new trends that can be used to overcome the limits of Moore's law, including neuromorphic, approximate, parallel, in-memory, and quantum computing. The author shows how these paradigms can enhance computing capability as developers confront the practical and physical limits of scaling while the demand for computing power keeps increasing. The discussion includes a state-of-the-art overview and the essential details of each of these paradigms.
Khaled Salah Mohamed studied at the Department of Electronics and Communications, Faculty of Engineering, Ain Shams University from 1998 to 2003, where he received his B.Sc. degree in Electronics and Communications Engineering with distinction and honors. He received his Master's degree in Electronics from Cairo University, Egypt, in 2008, and his Ph.D. degree in 2012. Dr. Khaled Salah is currently a Technical Lead in the Emulation Division at Mentor Graphics, Egypt. He has published a large number of papers in top refereed journals and conferences. His research interests include 3D integration, IP modeling, and SoC design.
Authors/Editors
Further Information & Material
Preface (p. 6)
Contents (p. 8)
Chapter 1: An Introduction: New Trends in Computing (p. 14)
  1.1 Introduction (p. 14)
    1.1.1 Power Wall (p. 15)
    1.1.2 Frequency Wall (p. 16)
    1.1.3 Memory Wall (p. 16)
  1.2 Classical Computing (p. 16)
    1.2.1 Classical Computing Generations (p. 17)
    1.2.2 Types of Computers (p. 18)
  1.3 Computer Architectures (p. 19)
    1.3.1 Instruction Set Architecture (ISA) (p. 19)
    1.3.2 Different Computer Architectures (p. 21)
      1.3.2.1 Von-Neumann Architecture: General-Purpose Processors (p. 21)
      1.3.2.2 Harvard Architecture (p. 23)
      1.3.2.3 Modified Harvard Architecture (p. 23)
      1.3.2.4 Superscalar Architecture: Parallel Architecture (p. 23)
      1.3.2.5 VLIW Architecture: Parallel Architecture (p. 24)
  1.4 New Trends in Computing (p. 25)
  1.5 Conclusions (p. 26)
  References (p. 26)
Chapter 2: Numerical Computing (p. 27)
  2.1 Introduction (p. 27)
  2.2 Numerical Analysis for Electronics (p. 28)
    2.2.1 Why EDA (p. 28)
    2.2.2 Applications of Numerical Analysis (p. 30)
    2.2.3 Approximation Theory (p. 31)
  2.3 Different Methods for Solving PDEs and ODEs (p. 32)
    2.3.1 Iterative Methods for Solving PDEs and ODEs (p. 34)
      2.3.1.1 Finite Difference Method (Discretization) (p. 34)
      2.3.1.2 Finite Element Method (Discretization) (p. 34)
      2.3.1.3 Legendre Polynomials (p. 35)
    2.3.2 Hybrid Methods for Solving PDEs and ODEs (p. 36)
    2.3.3 ML-Based Methods for Solving ODEs and PDEs (p. 36)
    2.3.4 How to Choose a Method for Solving PDEs and ODEs (p. 37)
  2.4 Different Methods for Solving SNLEs (p. 38)
    2.4.1 Iterative Methods for Solving SNLEs (p. 39)
      2.4.1.1 Newton Method and Newton–Raphson Method (p. 39)
      2.4.1.2 Quasi-Newton Method aka Broyden's Method (p. 42)
      2.4.1.3 The Secant Method (p. 45)
      2.4.1.4 The Muller Method (p. 46)
    2.4.2 Hybrid Methods for Solving SNLEs (p. 47)
    2.4.3 ML-Based Methods for Solving SNLEs (p. 47)
    2.4.4 How to Choose a Method for Solving Nonlinear Equations (p. 47)
  2.5 Different Methods for Solving SLEs (p. 48)
    2.5.1 Direct Methods for Solving SLEs (p. 49)
      2.5.1.1 Cramer's Rule Method (p. 49)
      2.5.1.2 Gaussian Elimination Method (p. 50)
      2.5.1.3 Gauss–Jordan (GJ) Elimination Method (p. 53)
      2.5.1.4 LU Decomposition Method (p. 54)
      2.5.1.5 Cholesky Decomposition Method (p. 55)
    2.5.2 Iterative Methods for Solving SLEs (p. 55)
      2.5.2.1 Jacobi Method (p. 56)
      2.5.2.2 Gauss–Seidel Method (p. 57)
      2.5.2.3 Successive Over-Relaxation (SOR) Method (p. 57)
      2.5.2.4 Conjugate Gradient Method (p. 58)
      2.5.2.5 Bi-conjugate Gradient Method (p. 59)
      2.5.2.6 Generalized Minimal Residual Method (p. 60)
    2.5.3 Hybrid Methods for Solving SLEs (p. 60)
    2.5.4 ML-Based Methods for Solving SLEs (p. 61)
    2.5.5 How to Choose a Method for Solving Linear Equations (p. 61)
  2.6 Common Hardware Architecture for Different Numerical Solver Methods (p. 62)
  2.7 Software Implementation for Different Numerical Solver Methods (p. 65)
    2.7.1 Cramer's Rule: Python Implementation (p. 65)
    2.7.2 Newton–Raphson: C Implementation (p. 66)
    2.7.3 Gauss Elimination: Python Implementation (p. 67)
    2.7.4 Conjugate Gradient: MATLAB Implementation (p. 68)
    2.7.5 GMRES: MATLAB Implementation (p. 69)
    2.7.6 Cholesky: MATLAB Implementation (p. 70)
  2.8 Conclusions (p. 71)
  References (p. 71)
Chapter 3: Parallel Computing: OpenMP, MPI, and CUDA (p. 74)
  3.1 Introduction (p. 74)
    3.1.1 Concepts (p. 75)
    3.1.2 Category of Processors: Flynn's Taxonomy/Classification (1966) (p. 76)
      3.1.2.1 Von-Neumann Architecture (SISD) (p. 76)
      3.1.2.2 SIMD (p. 77)
      3.1.2.3 MISD (p. 78)
      3.1.2.4 MIMD (p. 79)
    3.1.3 Category of Processors: Soft/Hard/Firm (p. 80)
    3.1.4 Memory: Shared Memory vs. Distributed Memory (p. 80)
    3.1.5 Interconnects: Between Processors and Memory (p. 83)
    3.1.6 Parallel Computing: Pros and Cons (p. 83)
  3.2 Parallel Computing: Programming (p. 84)
    3.2.1 Typical Steps for Constructing a Parallel Algorithm (p. 84)
    3.2.2 Levels of Parallelism (p. 85)
      3.2.2.1 Processor: Architecture Point of View (p. 85)
      3.2.2.2 Programmer Point of View (p. 85)
  3.3 Open Specifications for Multiprocessing (OpenMP) for Shared Memory (p. 86)
  3.4 Message-Passing Interface (MPI) for Distributed Memory (p. 88)
  3.5 GPU (p. 89)
    3.5.1 GPU Introduction (p. 89)
    3.5.2 GPGPU (p. 90)
    3.5.3 GPU Programming (p. 91)
      3.5.3.1 CUDA (p. 91)
    3.5.4 GPU Hardware (p. 94)
      3.5.4.1 The Parallella Board (p. 94)
  3.6 Parallel Computing: Overheads (p. 94)
  3.7 Parallel Computing: Performance (p. 95)
  3.8 New Trends in Parallel Computing (p. 101)
    3.8.1 3D Processors (p. 101)
    3.8.2 Network on Chip (p. 102)
    3.8.3 FCUDA (p. 103)
  3.9 Conclusions (p. 103)
  References (p. 103)
Chapter 4: Deep Learning and Cognitive Computing: Pillars and Ladders (p. 105)
  4.1 Introduction (p. 105)
    4.1.1 Artificial Intelligence (p. 105)
    4.1.2 Machine Learning (p. 107)
      4.1.2.1 Supervised Machine Learning (p. 109)
      4.1.2.2 Unsupervised Machine Learning (p. 110)
      4.1.2.3 Reinforcement Machine Learning (p. 111)
    4.1.3 Neural Networks and Deep Learning (p. 112)
  4.2 Deep Learning: Basics (p. 114)
    4.2.1 DL: What? Deep vs. Shallow (p. 114)
    4.2.2 DL: Why? Applications (p. 116)
    4.2.3 DL: How? (p. 116)
    4.2.4 DL: Frameworks and Tools (p. 121)
      4.2.4.1 TensorFlow (p. 122)
      4.2.4.2 Keras (p. 123)
      4.2.4.3 PyTorch (p. 123)
      4.2.4.4 OpenCV (p. 124)
      4.2.4.5 Others (p. 125)
    4.2.5 DL: Hardware (p. 125)
  4.3 Deep Learning: Different Models (p. 125)
    4.3.1 Feedforward Neural Networks (p. 125)
      4.3.1.1 Single-Layer Perceptron (SLP) (p. 127)
      4.3.1.2 Multilayer Perceptron (MLP) (p. 128)
      4.3.1.3 Radial Basis Function Neural Network (p. 129)
    4.3.2 Recurrent Neural Networks (RNNs) (p. 129)
      4.3.2.1 LSTMs (p. 130)
      4.3.2.2 GRUs (p. 131)
    4.3.3 Convolutional Neural Networks (CNNs): Feedforward (p. 131)
    4.3.4 Generative Adversarial Network (GAN) (p. 135)
    4.3.5 Autoencoder Neural Networks (p. 136)
    4.3.6 Spiking Neural Networks (p. 138)
    4.3.7 Other Types of Neural Networks (p. 139)
      4.3.7.1 Hopfield Networks (p. 139)
      4.3.7.2 Boltzmann Machine (p. 141)
      4.3.7.3 Restricted Boltzmann Machine (p. 141)
      4.3.7.4 Deep Belief Network (p. 141)
      4.3.7.5 Associative NN (p. 141)
  4.4 Challenges for Deep Learning (p. 141)
    4.4.1 Overfitting (p. 141)
    4.4.2 Underfitting (p. 142)
  4.5 Advances in Neuromorphic Computing (p. 142)
    4.5.1 Transfer Learning (p. 142)
    4.5.2 Quantum Machine Learning (p. 144)
  4.6 Applications of Deep Learning (p. 144)
    4.6.1 Object Detection (p. 144)
    4.6.2 Visual Tracking (p. 148)
    4.6.3 Natural Language Processing (p. 148)
    4.6.4 Digit Recognition (p. 149)
    4.6.5 Emotion Recognition (p. 149)
    4.6.6 Gesture Recognition (p. 149)
    4.6.7 Machine Learning for Communications (p. 150)
  4.7 Cognitive Computing: An Introduction (p. 150)
  4.8 Conclusions (p. 152)
  References (p. 152)
Chapter 5: Approximate Computing: Towards Ultra-Low-Power Systems Design (p. 156)
  5.1 Introduction (p. 156)
  5.2 Hardware-Level Approximation Techniques (p. 158)
    5.2.1 Transistor-Level Approximations (p. 158)
    5.2.2 Circuit-Level Approximations (p. 159)
    5.2.3 Gate-Level Approximations (p. 160)
      5.2.3.1 Approximate Multiplier Using Approximate Computing (p. 160)
      5.2.3.2 Approximate Multiplier Using Stochastic/Probabilistic Computing (p. 160)
    5.2.4 RTL-Level Approximations (p. 161)
      5.2.4.1 Iterative Algorithms (p. 161)
    5.2.5 Algorithm-Level Approximations (p. 162)
      5.2.5.1 Iterative Algorithms (p. 162)
      5.2.5.2 High-Level Synthesis (HLS) Approximations (p. 162)
    5.2.6 Device-Level Approximations: Memristor-Based Approximate Matrix Multiplier (p. 164)
  5.3 Software-Level Approximation Techniques (p. 164)
    5.3.1 Loop Perforation (p. 164)
    5.3.2 Precision Scaling (p. 165)
    5.3.3 Synchronization Elision (p. 165)
  5.4 Data-Level Approximation Techniques (p. 165)
    5.4.1 STT-MRAM (p. 165)
    5.4.2 Processing in Memory (PIM) (p. 165)
    5.4.3 Lossy Compression (p. 166)
  5.5 Evaluation: Case Studies (p. 166)
    5.5.1 Image Processing as a Case Study (p. 166)
    5.5.2 CORDIC Algorithm as a Case Study (p. 166)
    5.5.3 HEVC Algorithm as a Case Study (p. 169)
    5.5.4 Software-Based Fault Tolerance Approximation (p. 171)
  5.6 Conclusions (p. 171)
  References (p. 172)
Chapter 6: Near-Memory/In-Memory Computing: Pillars and Ladders (p. 175)
  6.1 Introduction (p. 175)
  6.2 Classical Computing: Processor-Centric Approach (p. 176)
  6.3 Near-Memory Computing: Data-Centric Approach (p. 178)
    6.3.1 HMC (p. 179)
    6.3.2 WideIO (p. 181)
    6.3.3 HBM (p. 181)
  6.4 In-Memory Computing: Data-Centric Approach (p. 181)
    6.4.1 Memristor-Based PIM (p. 181)
    6.4.2 PCM-Based PIM (p. 182)
    6.4.3 ReRAM-Based PIM (p. 184)
    6.4.4 STT-RAM-Based PIM (p. 185)
    6.4.5 FeRAM-Based PIM (p. 185)
    6.4.6 NRAM-Based PIM (p. 186)
    6.4.7 Comparison Between Different New Memories (p. 187)
  6.5 Techniques to Enhance DRAM Memory Controllers (p. 187)
    6.5.1 Techniques to Overcome the DRAM Wall (p. 189)
      6.5.1.1 Low-Power Techniques in DRAM Interfaces (p. 189)
      6.5.1.2 High-Bandwidth and Low-Latency Techniques in DRAM Interfaces (p. 191)
      6.5.1.3 High-Capacity and Small-Footprint Techniques in DRAM Interfaces (p. 192)
  6.6 Conclusions (p. 192)
  References (p. 193)
Chapter 7: Quantum Computing and DNA Computing: Beyond Conventional Approaches (p. 195)
  7.1 Introduction: Beyond CMOS (p. 195)
  7.2 Quantum Computing (p. 195)
    7.2.1 Quantum Computing: History (p. 197)
    7.2.2 Quantum Computing: What? (p. 198)
    7.2.3 Quantum Computing: Why? (p. 198)
    7.2.4 Quantum Computing: How? (p. 198)
  7.3 Quantum Principles (p. 200)
    7.3.1 Bits Versus Qubits (p. 200)
    7.3.2 Quantum Uncertainty (p. 201)
    7.3.3 Quantum Superposition (p. 201)
    7.3.4 Quantum Entanglement (Nonlocality) (p. 202)
  7.4 Quantum Challenges (p. 202)
  7.5 DNA Computing: From Bits to Cells (p. 202)
    7.5.1 What Is DNA? (p. 202)
    7.5.2 Why DNA Computing? (p. 202)
    7.5.3 How DNA Works (p. 203)
    7.5.4 Disadvantages of DNA Computing (p. 204)
    7.5.5 Traveling Salesman Problem Using DNA Computing (p. 204)
  7.6 Conclusions (p. 205)
  References (p. 206)
Chapter 8: Cloud, Fog, and Edge Computing (p. 207)
  8.1 Cloud Computing (p. 207)
  8.2 Fog/Edge Computing (p. 210)
  8.3 Conclusions (p. 213)
  References (p. 216)
Chapter 9: Reconfigurable and Heterogeneous Computing (p. 217)
  9.1 Embedded Computing (p. 217)
    9.1.1 Categories of Embedded Systems [2–5] (p. 217)
    9.1.2 Embedded System Classifications (p. 218)
    9.1.3 Components of Embedded Systems (p. 218)
    9.1.4 Microprocessor vs. Microcontroller (p. 219)
    9.1.5 Embedded Systems Programming (p. 220)
    9.1.6 DSP (p. 220)
  9.2 Real-Time Computing (p. 222)
  9.3 Reconfigurable Computing (p. 223)
    9.3.1 FPGA (p. 224)
    9.3.2 High-Level Synthesis (C/C++ to RTL) (p. 225)
    9.3.3 High-Level Synthesis (Python to HDL) (p. 226)
    9.3.4 MATLAB to HDL (p. 227)
    9.3.5 Java to VHDL (p. 227)
  9.4 Heterogeneous Computing (p. 228)
    9.4.1 Heterogeneity vs. Homogeneity (p. 229)
    9.4.2 Pollack's Rule (p. 229)
    9.4.3 Static vs. Dynamic Partitioning (p. 230)
    9.4.4 Heterogeneous Computing Programming (p. 230)
      9.4.4.1 Heterogeneous Computing Programming: OpenCL (p. 231)
  9.5 Conclusions (p. 231)
  References (p. 231)
Chapter 10: Conclusions (p. 233)
Index (p. 235)




