Topol | In-Memory Analytics with Apache Arrow | E-Book | www.sack.de
E-book, English, 406 pages

Topol In-Memory Analytics with Apache Arrow

Accelerate data analytics for efficient processing of flat and hierarchical data structures
2nd edition, 2025
ISBN: 978-1-83546-968-2
Publisher: De Gruyter
Format: EPUB
Copy protection: 0 - No protection




Apache Arrow is an open source, columnar in-memory data format designed for efficient data processing and analytics. This book harnesses the author's 15 years of experience to show you a standardized way to work with tabular data across various programming languages and environments, enabling high-performance data processing and exchange.
This updated second edition gives you an overview of the Arrow format, highlighting its versatility and benefits through real-world use cases. It guides you through enhancing data science workflows, optimizing performance with Apache Parquet and Spark, and ensuring seamless data translation. You'll explore data interchange and storage formats, and Arrow's relationships with Parquet, Protocol Buffers, FlatBuffers, JSON, and CSV. You'll also discover Apache Arrow subprojects, including Flight, Flight SQL, Arrow Database Connectivity (ADBC), and nanoarrow. You'll learn to streamline machine learning workflows, use Arrow Dataset APIs, and integrate with popular analytical data systems such as Snowflake, Dremio, and DuckDB. The later chapters present real-world examples and case studies of products powered by Apache Arrow, offering practical insights into its applications.
By the end of this book, you'll have all the building blocks to create efficient and powerful analytical services and utilities with Apache Arrow.


Further information & material


Matthew Topol:

Matthew Topol is a member of the Apache Arrow Project Management Committee (PMC) and a staff software engineer at Voltron Data, Inc. Matt has worked in infrastructure, application development, and large-scale distributed system analytical processing for financial data. At Voltron Data, Matt's primary responsibilities have been working on and enhancing the Apache Arrow libraries and associated sub-projects. In his spare time, Matt likes to bash his head against a keyboard, develop and run delightfully demented fantasy games for his victims—er—friends, and share his knowledge and experience with anyone interested enough to listen.


