Minimizing Data Movement and Parameter Count Across the Machine Learning Stack

Everything is a Matrix
Publication year: 2026
ISBN: 978-3-032-23099-7
Publisher: Springer

Book, English, 110 pages, format (W × H): 168 mm × 240 mm

Series: Synthesis Lectures on Computer Science

42.79 € (incl. VAT)

Free shipping on books
Delivery time: up to 10 working days

Free returns

This book provides a focused, research-forward guide to making large AI models efficient in practice and presents an array of novel techniques to reduce memory footprint, accelerate computation, and improve overall hardware utilization. The author demonstrates that substantial efficiency gains can be achieved by rethinking how data is computed, stored, and compressed, with a special focus on matrices, the core computational structure underpinning both scientific computing and neural networks. Modern AI models run on huge grids of numbers (matrices/tensors), and their speed and affordability depend on how those numbers are arranged and processed on real hardware (GPUs/TPUs/CPUs). This book explains practical methods to skip unnecessary work (structured sparsity), move data efficiently (gather/scatter), and shrink models without losing accuracy (block distillation) so that AI systems can use less memory, less time, and less energy without sacrificing quality. In addition, the book shows how to turn algorithmic ideas into hardware-aware speedups on GPUs/TPUs. Readers will learn when sparsity pays off, how to schedule irregular workloads, and how to recover accuracy in compressed models. Case studies illustrate end-to-end design choices, evaluation, and pitfalls. The result is a coherent perspective that bridges theory, compilers/runtimes, and real-world deployment.

Target audience

Professional/practitioner

Authors/Editors

Sabot, Andrew

Further information & material

Table of contents

Introduction and Roadmap.- CAKE: Memory-Aware Block Shaping for GEMM.- mCAKE: From Matrices to Tensors.- Rosko: Structured Sparsity for ML Workloads.- Gather/Scatter for Rank-Sliced Activations.- Low-Rank Models via SVD.- Blockwise Knowledge Distillation.- Privacy-Preserving Split Inference (Edge/Cloud).- Conclusion: Design Rules, Evaluation, and Outlook.

About the author

Andrew Sabot, Ph.D., is a Software Engineer working on Machine Learning at Google. He received his Ph.D. (2025) and M.S. (2021) in Computer Science from Harvard University. Dr. Sabot’s work focuses on the intersection of hardware-aware kernels, model compression, and transformer inference acceleration to enable the sustainable deployment of state-of-the-art AI.

Webcode: sack.de/xogin