
Book, English, 256 pages, format (W × H): 157 mm × 235 mm, weight: 531 g

Santry

Demystifying Deep Learning

An Introduction to the Mathematics of Neural Networks


ISBN: 978-1-394-20560-8
Publisher: Turner Publishing Company


Discover how deep learning models are trained by building real deep learning software libraries and verification software!

The study of Deep Learning and Artificial Neural Networks (ANNs) is a significant subfield of artificial intelligence (AI) found in numerous fields: medicine, law, financial services, and science, for example. Just as the robot revolution threatened blue-collar jobs in the 1970s, the AI revolution now promises a new era of productivity for white-collar jobs. Important tasks are already being taken over by ANNs, from disease detection and prevention, to reading and supporting legal contracts, to understanding experimental data, modeling protein folding, and hurricane modeling. AI is everywhere--on the news, in think tanks, and on the agendas of government policy makers all over the world--and ANNs often provide its backbone.

Taking an informal and succinct approach, Demystifying Deep Learning is a useful tool for learning the steps needed to implement ANN algorithms, using both a software library for neural network training and verification software. The volume explains how real ANNs work and includes six practical examples that demonstrate, in real code, how to build ANNs and the datasets they need; the code is available as open source to ensure practical usage. This approachable book covers the ANN techniques used every day in natural language processing, image recognition, problem solving, and generative applications. It is an important introduction to the field, equipping the reader for more advanced study.

Demystifying Deep Learning readers will also find:

* A volume that emphasizes the importance of classification
* Discussion of why ANN libraries (such as TensorFlow and PyTorch) are written in C++ rather than Python
* A "Projects" page at the end of each chapter to encourage students to experiment with real code
* A supporting library of software to accompany the book at https://github.com/nom-de-guerre/RANT
* An approachable explanation of how generative AI, such as generative adversarial networks (GANs), really works
* An accessible motivation and elucidation of how transformers, the basis of large language models (LLMs) such as ChatGPT, work

Demystifying Deep Learning is ideal for engineers and professionals who need to learn and understand ANNs in their work. It is also a helpful text for advanced undergraduates seeking a solid grounding in the topic.

Authors/Editors


Further Information & Material


About the Author

Acronyms

1 Introduction 1

1.1 AI/ML - Deep Learning? 7

1.2 A Brief History 9

1.3 The Genesis of Models 14

1.3.1 Rise of the Empirical Functions 15

1.3.2 The Biological Phenomenon and the Analogue 20

1.4 Numerical Computation - Computer Numbers Are Not Real 21

1.4.1 The IEEE 754 Floating Point System 24

1.4.2 Numerical Coding Tip: Think in Floating Point 28

1.5 Summary 31

1.6 Projects 32

2 Deep Learning and Neural Networks 33

2.1 Feed-Forward and Fully-Connected Artificial Neural Networks 34

2.2 Computing Neuron State 41

2.2.1 Activation Functions 42

2.3 The Feed-Forward ANN Expressed with Matrices 45

2.3.1 Neural Matrices: A Convenient Notation 47

2.4 Classification 48

2.4.1 Binary Classification 50

2.4.2 One-Hot Encoding 52

2.4.3 The Softmax Layer 54

2.5 Summary 57

2.6 Projects 58

3 Training Neural Networks 59

3.1 Preparing the Training Set: Data Preprocessing 60

3.2 Weight Initialization 66

3.3 Training Outline 68

3.4 Least Squares: A Trivial Example 71

3.5 Backpropagation of Error for Regression 75

3.5.1 The Terminal Layer (Output) 80

3.5.2 Backpropagation: The Shallower Layers 84

3.5.3 The Complete Backpropagation Algorithm 90

3.5.4 A Word on the Rectified Linear Unit (ReLU) 92

3.6 Stochastic Sine 95

3.7 Verification of a Software Implementation 98

3.8 Summary 105

3.9 Projects 106

4 Training Classifiers 107

4.1 Backpropagation for Classifiers 107

4.1.1 Likelihood 108

4.1.2 Categorical Loss Functions 111

4.2 Computing the Derivative of the Loss 114

4.2.1 Initiate Backpropagation 118

4.3 Multilabel Classification 120

4.3.1 Binary Classification 121

4.3.2 Training A Multilabel Classifier ANN 122

4.4 Summary 124

4.5 Projects 124

5 Weight Update Strategies 127

5.1 Stochastic Gradient Descent 128

5.2 Weight Updates as Iteration and Convex Optimization 134

5.2.1 Newton's Method for Optimization 137

5.3 RPROP+ 141

5.4 Momentum Methods 144

5.4.1 AdaGrad and RMSProp 145

5.4.2 ADAM 147

5.5 Levenberg-Marquardt Optimization for Neural Networks 151

5.6 Summary 159

5.7 Projects 159

6 Convolutional Neural Networks 161

6.1 Motivation 162

6.2 Convolutions and Features 164

6.3 Filters 170

6.4 Pooling 174

6.5 Feature Layers 175

6.6 Training a CNN 179

6.6.1 Flatten and the Gradient 180

6.6.2 Pooling and the Gradient 181

6.6.3 Filters and the Gradient 183

6.7 Applications 189

6.8 Summary 190

6.9 Projects 190

7 Fixing the Fit 193

7.1 Quality of the Solution 193

7.2 Generalization Error 194

7.2.1 Bias 195

7.2.2 Variance 197

7.2.3 The Bias-Variance Tradeoff 198

7.2.4 The Bias-Variance Tradeoff in Context 199

7.2.5 The Test Set 201

7.3 Classification Performance 204

7.4 Regularization 209

7.4.1 Forward Pass During Training 210

7.4.2 Forward Pass During Normal Inference 211

7.4.3 Backpropagation of Error 212

7.5 Advanced Normalization 216

7.5.1 Batch Normalization 217

7.5.2 Layer Normalization 223

7.6 Summary 228

7.7 Projects 228

8 Design Principles for a Deep Learning Training Library 231

8.1 Computer Languages 232

8.2 The Matrix: Crux of a Library Implementation 239

8.2.1 Memory Access and Modern CPU Architectures 241

8.2.2 Designing Matrix Computations 246

8.3 The Framework 251

8.4 Summary 255

8.5 Projects 255

9 Vistas 257

9.1 The Limits of ANN Learning Capacity 257

9.2 Generative Adversarial Networks 260

9.2.1 GAN Architecture 262

9.2.2 The GAN Loss Function 265

9.3 Reinforcement Learning 269

9.3.1 The Elements of Reinforcement Learning 272

9.3.2 A Trivial RL Training Algorithm 275

9.4 Natural Language Processing Transformed 284

9.4.1 The Challenges of Natural Language 285

9.4.2 Word Embeddings 287

9.4.3 Attention 291

9.4.4 Transformer Blocks 294

9.4.5 Multi-Head Attention 300

9.4.6 Transformer Applications 303

9.5 Neural Turing Machines 305

9.6 Summary 309

9.7 Projects 309

A Mathematical Review 311

A.1 Linear Algebra 311

A.1.1 Vectors 311

A.1.2 Matrices 313

A.1.3 Matrix Properties 316

A.1.4 Linear Independence 317

A.1.5 The QR Decomposition 317

A.1.6 Least Squares 318

A.1.7 Eigenvalues and Eigenvectors 319

A.1.8 Hadamard Operations 319

A.2 Basic Calculus 320

A.2.1 The Product Rule 321

A.2.2 The Chain Rule 322

A.2.3 Multi-Variable Functions 322

A.2.4 Taylor Series 323

A.3 Advanced Matrices 324

A.4 Probability 324

B Glossary 327

Bibliography 336

Index 363


Douglas Santry, PhD, is a lecturer in Computer Science at the University of Kent, UK. Dr. Santry obtained his PhD from the University of Cambridge. Prior to his current position, he worked extensively in industry with Apple Computer Corp., NetApp, and Goldman Sachs.

