E-book, English, 178 pages
Hoffman Apache Flume: Distributed Log Collection for Hadoop
2nd edition, 2025
ISBN: 978-1-78439-914-6
Publisher: De Gruyter
Format: PDF
Copy protection: Adobe DRM
Design and implement a series of Flume agents to send streamed data into Hadoop
Book Description
If you are a Hadoop programmer who wants to learn about Flume to be able to move datasets into Hadoop in a timely and replicable manner, then this book is ideal for you. No prior knowledge about Apache Flume is necessary, but a basic knowledge of Hadoop and the Hadoop File System (HDFS) is assumed.
What you will learn
- Understand the Flume architecture, and also how to download and install open source Flume from Apache
- Follow along with a detailed example of transporting weblogs in near real time (NRT) to Kibana/Elasticsearch and archiving them in HDFS
- Learn tips and tricks for transporting logs and data in your production environment
- Understand and configure the Hadoop File System (HDFS) Sink
- Use a morphline-backed Sink to feed data into Solr
- Create redundant data flows using sink groups
- Configure and use various sources to ingest data
- Inspect data records and move them between multiple destinations based on payload content
- Transform data en route to Hadoop and monitor your data flows