Neural Networks

A Concise Theoretical Foundation
Publication year: 2027
ISBN: 978-0-443-51339-8
Publisher: Elsevier Science & Technology

Book, English, format (W × H): 191 mm × 235 mm

Neural Networks: A Concise Theoretical Foundation provides a comprehensive, structured overview of neural networks, covering the basics as well as advanced topics. The book begins with fundamental concepts such as perceptrons, gradient-based learning, and optimization, then progresses through convolutional and recurrent networks, attention mechanisms, graph neural networks, and transformer-based models. Further sections explore generative models, reinforcement learning, biological perspectives, and the development of large-scale intelligent systems. The material is intended for students, researchers, and professionals who aim for both theoretical depth and a practical understanding of modern neural network methods.

The book was written to bring clarity and structure to the subject by presenting a unified framework that connects mathematical foundations with recent innovations. It aims to support learners and practitioners who want a single reference that explains both the fundamental ideas and the developments that have shaped current research and technology in artificial intelligence.

Many readers find it difficult to connect basic principles with the wide variety of advanced neural network models in use today. Concepts such as transformers, diffusion models, or biologically inspired systems can seem disconnected from basic learning algorithms. The book addresses that problem by organizing the material in a logical sequence that builds understanding step by step. It helps readers develop a clear, integrated view of how different techniques relate to each other and how they are applied in real-world systems.

Authors/Editors

Sarraf, Aydin

Kucerovsky, Daniel Z.

Further Information & Material

Table of Contents

Part I. Introduction
1. Foundations of Neural Networks

Part II. Multilayer Perceptrons
2.1 Activation Functions
2.2 Loss Functions
2.3 MLP as a Mathematical Function
2.4 Applications of MLPs

Part III. Training of Neural Networks
3.1 Optimization Fundamentals
3.1.1 Steepest Descent
3.1.2 Stein Variational Gradient Descent (SVGD)
3.1.3 Batch, Mini-batch, and Stochastic GD
3.1.4 Local Minima, Saddle Points, and Plateaus
3.1.5 Momentum
3.1.6 Gradient Clipping
3.1.7 Weight Initialization
3.1.8 Normalization
3.2 Gradient Computation and Challenges
3.2.1 Backpropagation
3.2.2 Automatic Differentiation
3.2.3 Vanishing and Exploding Gradients
3.3 Hyperparameter Tuning and Training Strategies
3.3.1 Hyperparameter Tuning
3.3.2 Learning Rate Scheduling
3.3.3 L1/L2 Regularization, Dropout, and DropConnect
3.3.4 Early Stopping and Cross-Validation
3.3.5 Data Augmentation, Bagging, Boosting, and Stacking
3.4 Advanced Learning Methods
3.4.1 Distributed and Federated Learning
3.4.2 Offline and Online Learning
3.4.3 TinyML and Energy Efficiency

Part IV. Expressivity and Generalization
4.1 Human vs. Machine Learning Mechanisms
4.2 Expressivity of Neural Networks
4.2.1 Kolmogorov-Arnold Representation Theorem
4.2.2 Universal Approximation Theorem
4.2.3 Neural Tangent Kernel Approximation Theorem
4.2.4 Neuromanifolds
4.3 Generalization of Neural Networks
4.3.1 Bias-Variance Trade-off
4.3.2 Double Descent Phenomenon
4.3.3 Curse of Dimensionality
4.3.4 No Free Lunch Theorem
4.3.5 Stability-Plasticity Dilemma
4.3.6 Implicit Regularization and Neural Collapse
4.3.7 Techniques to Prevent Overfitting
4.3.8 Complexity Measures
4.3.9 Learning Principles

Part V. Convolutional Neural Networks
5.1 Tensor
5.1.1 Tensor Operations
5.2 Convolution
5.2.1 Convolutional Layers
5.2.2 Pooling Layers
5.2.3 Normalization Layers
5.2.4 Convolution as Matrix Multiplication
5.2.5 Bounding Boxes
5.3 Architectural Designs
5.3.1 Manual Architecture Design
5.3.2 Automated Architecture Search
5.4 CNN as a Mathematical Function
5.5 Post-Hoc Explainability Methods for CNNs
5.6 Applications of CNNs

Part VI. Recurrent Neural Networks
6.1 Stochastic Processes
6.1.1 Single Stochastic Processes
6.1.2 Doubly Stochastic Processes
6.1.3 Hierarchical Stochastic Process
6.2 RNN as a Stochastic Process Approximator
6.3 Dynamical Systems
6.3.1 State Space Models
6.4 RNN as a Dynamical System
6.5 RNN as a Mathematical Function
6.6 Temporal Convolutional Neural Networks
6.7 Applications of RNNs

Part VII. Graph Neural Networks
7.1 Introduction to Graphs
7.2 Graph Traversal
7.3 Community Detection
7.4 Graph Neural Network Architectures
7.4.1 Graph Convolutional Networks (GCNs)
7.4.2 Graph Attention Networks (GATs)
7.4.3 Message Passing Neural Networks (MPNNs)
7.4.4 Weisfeiler-Lehman Graph Isomorphism
7.4.5 Graph Isomorphism Networks (GINs)
7.4.6 GraphSAGE
7.4.7 Infomax Principle
7.5 Challenges in Training Graph Neural Networks
7.6 Applications of GNNs

Part VIII. Generative Neural Networks
8.1 Introduction to Game Theory
8.2 Generative Adversarial Networks
8.2.1 Variants of GANs
8.2.2 Challenges in Training GANs
8.2.3 Evaluation of GANs
8.3 Autoencoders
8.3.1 Variants of Autoencoders
8.4 Autoregressive Models
8.4.1 Variants of Autoregressive Models
8.5 Diffusion Models
8.5.1 Forward and Reverse Diffusion
8.5.2 Variants of Diffusion Models
8.6 Normalizing Flows
8.6.1 Variants of Normalizing Flows
8.7 Applications of Generative Models

Part IX. Deep Reinforcement Learning
9.1 Introduction to Decision Theory
9.2 Markov Decision Processes and Bellman Equations
9.2.1 Markov Decision Process (MDP)
9.2.2 Partially Observable Markov Decision Process (POMDP)
9.2.3 Bellman Equations
9.2.4 Dynamic Programming (DP)
9.2.5 Monte Carlo (MC) vs. Temporal Difference (TD)
9.2.6 Exploration vs. Exploitation
9.3 Model-Free vs. Model-Based
9.3.1 Model-Free Reinforcement Learning
9.3.2 Model-Based Reinforcement Learning
9.4 Specialized RL Methods
9.4.1 Multi-Agent RL
9.4.2 Hierarchical RL
9.4.3 Inverse RL
9.4.4 Offline RL
9.4.5 Causal RL
9.4.6 Behavioral Cloning, GRPO, and Imitation Learning
9.5 Challenges in Training RL Agents
9.6 Applications of RL

Part X. Attention-Based Neural Networks
10.1 Introduction to Attention and Transformers
10.1.1 Attention Mechanism
10.1.2 Transformers
10.2 Autoregressive Transformers
10.3 Bidirectional Transformers
10.4 Sequence-to-Sequence Transformers
10.4.1 T5
10.4.2 FLAN
10.5 Efficient Transformers
10.6 Sparse and Memory-Augmented Transformers
10.6.1 Transformer-XL
10.6.2 Octo Transformer
10.7 Mixture of Experts
10.8 Retrieval-Augmented Transformers
10.9 Vision and Multimodal Transformers
10.9.1 ViT, VLM, and CrossFormer
10.9.2 VLA, VLMA, and OpenVLA
10.9.3 CLIP and DINO
10.10 Applications of Attention-Based Models

Part XI. Large Language Models and Agents
11.1 Introduction to LLMs
11.2 Training LLMs from Scratch
11.2.1 Data Curation and Tokenization
11.2.2 Model Architecture and Pre-training Objective
11.2.3 Training Infrastructure and Methodologies
11.3 Fine-Tuning and Adaptation Techniques
11.3.1 Alignment and Preference Optimization
11.3.2 Parameter-Efficient Fine-Tuning (PEFT)
11.4 Prompt Engineering
11.4.1 Prompting Techniques
11.4.2 Prompt Management and Optimization
11.5 Evaluation and Safety of LLMs
11.5.1 Performance, Robustness, and Failure Modes
11.5.2 Benchmarking and Testing
11.5.3 Safety and Security
11.6 Autonomous Agents and Multi-Agent Systems
11.6.1 Agent Formalism
11.6.2 Agent Robustness
11.6.3 Multi-Agent Communication and Coordination
11.6.4 Orchestration and Monitoring
11.7 RAG Architectures and Evaluation Metrics
11.7.1 RAG Architectures
11.7.2 Evaluation Metrics for RAG
11.8 Security and Safety in LLMs and Agentic Systems
11.8.1 Threats and Exploits in LLMs and Agentic Systems
11.8.2 Privacy, Governance, and Compliance
11.9 Applications of LLMs and Agents

Part XII. Continuous-Time Neural Networks
12.1 Differential Equations
12.1.1 Ordinary Differential Equations (ODEs)
12.1.2 Partial Differential Equations (PDEs)
12.1.3 Stochastic Differential Equations (SDEs)
12.1.4 Delay Differential Equations (DDEs)
12.1.5 Differential-Algebraic Equations (DAEs)
12.1.6 Fractional Differential Equations (FDEs)
12.2 Physics-Informed Neural Networks and Operators
12.2.1 Physics-Informed Neural Networks
12.2.2 Physics-Informed Neural Operators
12.3 Continuous-Time Neural Networks
12.3.1 Neural ODEs
12.3.2 Neural Controlled Differential Equations (Neural CDEs)
12.3.3 Continuous-Time Recurrent Neural Networks (CTRNNs)
12.3.4 Liquid Time-Constant (LTC) Networks
12.3.5 Hamiltonian Neural Networks

Part XIII. Biologically-Inspired Neural Networks
13.1 Attractor-Based Models
13.1.1 Hopfield Networks
13.1.2 Continuous Attractor Neural Networks (CANNs)
13.1.3 Continuous Hopfield Networks
13.2 Spiking-Based Models
13.2.1 Spiking Neural Networks (SNNs)
13.3 Plasticity-Based Models
13.3.1 BCM (Bienenstock–Cooper–Munro) Model
13.3.2 Adaptive Resonance Theory (ART)
13.4 Top-Down Feedback Models
13.4.1 Ladder Networks
13.4.2 Predictive Coding Networks
13.5 Topographic-Based Models
13.5.1 Self-Organizing Map (SOM)
13.6 Energy-Based Models
13.6.1 Restricted Boltzmann Machines (RBMs)
13.6.2 Deep Belief Networks (DBNs)
13.7 Memory-Based Models
13.7.1 Neural Turing Machines (NTMs)
13.7.2 Differentiable Neural Computers (DNCs)

Part XIV. Bayesian Neural Networks
14.1 Introduction to Bayesian Neural Networks
14.1.1 Gaussian Processes
14.1.2 Bayesian Neural Network Modeling
14.2 Approximate Inference and Training Strategies
14.2.1 Variational Inference
14.2.2 Monte Carlo Sampling Techniques
14.2.3 Practical Challenges and Applications

Part XV. Quantum Neural Networks
15.1 Introduction to Quantum Theory
15.1.1 Quantum States and Dirac’s Bra-Ket Notation
15.1.2 Wave-Particle Duality and Experimental Evidence
15.1.3 Quantization of Energy and the Photoelectric Effect
15.1.4 Quantum Superposition and Measurement
15.1.5 Discrete Energy Levels and the Hydrogen Atom
15.1.6 Heisenberg Uncertainty Principle
15.1.7 Observables, Operators, and Measurement Postulate
15.1.8 Commutation Relations and Incompatibility
15.1.9 Schrödinger Equation and Time Evolution
15.1.10 Collapse of the Wave Function
15.1.11 Quantum Entanglement
15.1.12 Quantum Tunneling
15.1.13 Spin and Intrinsic Angular Momentum
15.1.14 Pauli Exclusion Principle
15.1.15 Bell’s Theorem and Quantum Nonlocality
15.1.16 Quantum Field Theory
15.2 Quantum Computing
15.2.1 Qubits and Quantum Gates
15.2.2 Quantum Algorithms
15.2.3 Quantum Circuits
15.3 Quantum Neural Networks
15.3.1 Variational Quantum Circuits (VQCs)
15.3.2 Quantum Circuit Learning (QCL)
15.3.3 Training Quantum Neural Networks

Part XVI. Conclusion

About the Authors

Kucerovsky, Daniel Z.
Dr. Daniel Z. Kucerovsky has a PhD from the University of Oxford. He is a Full Professor in the Department of Mathematics and Statistics at the University of New Brunswick, Canada. He has more than 60 publications in top mathematical journals. His research focuses on C*-algebras, and he introduced and developed the now well-known CFP property of C*-algebras. He has also applied C*-algebraic methods to establish the now-popular unbounded Kasparov product approach to proving index theorems. Among other interests, Dr. Kucerovsky has worked with bi-algebras and with corona algebras, a particular kind of very large C*-algebra in which various constructions based on large cardinal numbers and/or unusual topological spaces allow surprising results to be proven.

Sarraf, Aydin
Dr. Aydin Sarraf is a Principal Data Scientist at Ericsson and has more than ten years of experience as a data scientist. He has filed 10 patents and published 17 papers in AI/ML and mathematics. He has hands-on experience implementing and training neural networks, from classical MLPs to modern LLMs.

Available for pre-order; expected publication date approx. January 2027

Webcode: sack.de/8lqqe