Adversarial Machine Learning
Mechanisms, Vulnerabilities, and Strategies for Trustworthy AI
Book, English, 400 pages
ISBN: 978-1-394-40203-8
Publisher: John Wiley & Sons Inc
Enables readers to understand the full lifecycle of adversarial machine learning (AML) and how AI models can be compromised
Adversarial Machine Learning is a definitive guide to one of the most urgent challenges in artificial intelligence today: how to secure machine learning systems against adversarial threats.
This book explores the full lifecycle of adversarial machine learning (AML), providing a structured, real-world understanding of how AI models can be compromised—and what can be done about it.
The book walks readers through the different phases of the machine learning pipeline, showing how attacks emerge during training, deployment, and inference. It breaks down adversarial threats into clear categories based on attacker goals—whether to disrupt system availability, tamper with outputs, or leak private information. With clarity and technical rigor, it dissects the tools, knowledge, and access attackers need to exploit AI systems.
In addition to diagnosing threats, the book provides a robust overview of defense strategies—from adversarial training and certified defenses to privacy-preserving machine learning and risk-aware system design. Each defense is discussed alongside its limitations, trade-offs, and real-world applicability.
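Of the defenses named above, certified approaches are perhaps the least familiar; randomized smoothing is a representative example. It classifies many Gaussian-noised copies of an input and returns the majority vote, and a sufficiently large vote margin translates into a provable robustness radius. The sketch below is a minimal illustration, not the book's code; the noise level sigma, the sample count, and the PyTorch classifier interface are all assumptions.

```python
import torch

def smoothed_predict(model, x, sigma=0.25, n_samples=100):
    """Randomized smoothing: majority-vote prediction of `model`
    over Gaussian-noised copies of a single input `x`."""
    votes = torch.zeros(0, dtype=torch.long)
    with torch.no_grad():
        for _ in range(n_samples):
            noisy = x + sigma * torch.randn_like(x)    # isotropic Gaussian noise
            pred = model(noisy.unsqueeze(0)).argmax(dim=1)
            votes = torch.cat([votes, pred])
    # The winning class; in the certified setting, the vote margin
    # bounds an L2 radius within which the prediction cannot change.
    return torch.bincount(votes).argmax().item()
```

The trade-off is exactly the kind the book emphasizes: the certificate costs n_samples extra forward passes per prediction, and the guarantee holds only against L2-bounded perturbations.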
Readers will gain a comprehensive view of today's most dangerous attack methods, including:
- Evasion attacks that manipulate inputs to deceive AI predictions (see the sketch after this list)
- Poisoning attacks that corrupt training data or model updates
- Backdoor and trojan attacks that embed malicious triggers
- Privacy attacks that reveal sensitive data through model interaction and prompt injection
- Generative AI attacks that exploit the new wave of large language models
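To make the first of these categories concrete, the sketch below implements the fast gradient sign method (FGSM), the classic gradient-based evasion technique covered in Chapter 4: it nudges an input in the direction that most increases the model's loss. This is a minimal illustration under stated assumptions (a PyTorch classifier, inputs in the [0, 1] range, an illustrative epsilon), not code from the book.

```python
import torch
import torch.nn.functional as F

def fgsm_evasion(model, x, y, epsilon=0.03):
    """FGSM: perturb a batch `x` with true labels `y` so the model's
    loss increases, keeping the change small in L-infinity norm."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)      # loss w.r.t. true labels
    loss.backward()
    # One step in the sign of the input gradient maximizes loss growth
    # under an L-infinity budget of epsilon.
    x_adv = x_adv + epsilon * x_adv.grad.sign()
    return x_adv.clamp(0.0, 1.0).detach()        # stay in valid pixel range
```

A single such step is often enough to flip an undefended image classifier's prediction while leaving the perturbation invisible to a human, which is why evasion is usually the first attack class studied.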
Blending technical depth with practical insight, Adversarial Machine Learning equips developers, security engineers, and AI decision-makers with the knowledge they need to understand the adversarial landscape and defend their systems with confidence.
Contents
Preface xi
Acknowledgments xiii
From the Author xv
Introduction xvii
About the Companion Website xxi
1 The Age of Intelligent Threats 1
The Rise of AI as a Security Target 1
Fragility in Intelligent Systems 3
Categories of AI: Predictive, Generative, and Agentic 5
Milestones in Adversarial Vulnerability 8
Intelligence as an Attack Multiplier 10
Why This Book and Who It's For 12
2 Anatomy of AI Systems and Their Attack Surfaces 21
The Architecture of Predictive, Generative, and Agentic AI 21
The AI Development Lifecycle: From Data to Deployment 24
Classical Machine Learning vs. Modern AI Pipelines 26
Identifying Entry Points: Training, Inference, and Supply Chain 28
Security Debt in the Model Development Lifecycle 31
3 The Adversary's Playbook 39
Threat Actors: Profiles, Motivations, and Objectives 39
White-Box Attack Techniques and Methodologies 41
Black-Box Attack Techniques and Methodologies 44
Gray-Box Attack Techniques and Methodologies 47
Operationalizing AI Attacks: Tactical Methodologies and Execution 49
Advanced Multi-Stage and Coordinated AI Attacks 52
4 Evasion Attacks—Tricking AI Models at Inference 61
Core Principles and Mechanisms of Evasion Attacks 61
Gradient-Based Evasion Techniques 64
Linguistic and Textual Evasion Methods 67
Image- and Vision-Based Evasion Techniques 69
Evasion Attacks on Time-Series and Sequential Models 72
5 Poisoning Attacks—Compromising AI Systems During Training 81
Fundamentals and Mechanisms of Training-Time Poisoning 81
Label Manipulation and Clean-Label Poisoning Techniques 84
Backdoor and Trojan Insertion in Training Data 86
Poisoning Attacks on Federated and Distributed Learning Systems 89
Poisoning Attacks Against Reinforcement Learning (RL) Systems 91
Poisoning Attacks on Transfer Learning and Fine-Tuning Processes 94
6 Privacy Attacks—Extracting Secrets from AI Models 103
Core Mechanisms and Objectives of AI Privacy Attacks 103
Membership Inference Techniques 106
Model Inversion Attacks and Data Reconstruction 109
Attribute and Property Inference Attacks 111
Model Extraction and Functionality Reconstruction 114
Exploiting Privacy Leakage Through Prompting Generative AI 117
7 Backdoor and Trojan Attacks—Embedding Hidden Behaviors in AI Models 125
Fundamental Concepts of AI Backdoors and Trojans 125
Backdoor Trigger Design and Optimization 128
Data Poisoning Methods for Backdoor Embedding 130
Trojan Attacks in Transfer and Fine-Tuning Scenarios 132
Embedding Backdoors in Federated and Decentralized Training 135
Advanced Trigger Embedding in Generative and Agentic AI Models 137
8 The Generative AI Attack Surface 147
Architectural Foundations of Large Language Models 147
How Generative Architectures Expand Attack Opportunities 150
Exploiting Fine-Tuning as an Adversarial Vector 152
Prompt Engineering as an Adversarial Exploitation Pathway 155
Technical Risks in Retrieval-Augmented Generation Systems 157
Leveraging Model Internals for Generative AI Exploitation 160
9 Prompt Injection and Jailbreak Techniques 169
Technical Foundations of Prompt Injection Attacks 169
Direct Prompt Injection Methods and Input Crafting 173
Indirect Prompt Injection via External or Retrieved Content 175
Jailbreak Techniques and Semantic Boundary Exploitation 177
Token-Level and Embedding Space Manipulations 180
Contextual and Conversational Injection Strategies 182
10 Data Leakage and Model Hallucination 191
Technical Mechanisms of Data Leakage in Generative Models 191
Membership and Attribute Inference via Generative Outputs 195
Model Inversion and Training Data Reconstruction 197
Hallucination Exploitation in Generative Outputs 199
Prompt-Based Extraction of Memorized Data 202
Exploiting Multi-Modal and Cross-Modal Leakage in Generative Models 204
11 Adversarial Fine-Tuning and Model Reprogramming 213
Technical Foundations of Adversarial Fine-Tuning 213
Semantic Perturbation Methods for Adversarial Fine-Tuning 216
Embedding Covert Behaviors via Adversarial Prompt Conditioning 219
Advanced Trojan Embedding via Fine-Tuning Gradients 221
Cross-Model and Transferable Adversarial Fine-Tuning Attacks 223
Model Reprogramming via Adversarial Fine-Tuning Techniques 226
12 Agentic AI and Autonomous Threat Loops 235
Technical Foundations of Agentic AI Systems 235
Technical Manipulation of Autonomous Decision Loops 238
Exploitation of Agentic Memory and Context Management 241
Agentic Tool Integration and External API Exploitation 244
Technical Embedding of Autonomous Chain Injection 246
Exploitation of Environmental Interactions and Stateful Vulnerabilities 248
13 Securing the AI Supply Chain 257
Technical Mechanisms of Supply Chain Poisoning in AI Models 257
Artifact and Model Checkpoint Contamination Techniques 260
Technical Exploitation of Third-Party AI Libraries and Frameworks 263
Dataset Provenance and Annotation Manipulation Techniques 265
Technical Exploitation of Hosted and Cloud-Based Model Infrastructure 268
Artifact Repositories and Model Zoo Contamination Methods 270
14 Evaluating AI Robustness and Response Strategies 277
Technical Foundations of AI Robustness Evaluation 277
Metrics for Evaluating AI Security and Robustness 279
Robust Optimization Methods and Adversarial Training 282
Certified Robustness and Formal Verification Techniques 285
Technical Benchmarking Tools and Evaluation Frameworks 287
Technical Analysis of Robustness Across Model Architectures and Modalities 289
15 Building Trustworthy AI by Design 299
Technical Foundations of Security-by-Design in AI Systems 299
Robust Embedding and Representation Learning Methods 302
Technical Approaches to Adversarially Robust Architectures 304
Technical Integration of Formal Verification in Model Design 306
Technical Frameworks for Runtime Anomaly Detection and Filtering 308
Technical Embedding of Model Interpretability and Transparency 310
16 Looking Ahead—Security in the Era of Intelligent Agents 319
Technical Foundations of Future Agentic AI Systems 319
Emerging Technical Attack Vectors in Agentic Systems 322
Technical Exploitation of Multi-Modal and Cross-Domain Agentic Capabilities 325
Future Technical Capabilities in Automated Adversarial Generation 327
Technical Mechanisms for Evaluating Advanced Agentic Robustness 330
Technical Embedding of Ethical Constraints and Safety Mechanisms 332
Recommendations 335
Conclusion 337
Key Concepts 337
Glossary 341
Index 367