Azarmi | Scalable Big Data Architecture | E-Book | www.sack.de

E-book, English, 141 pages

Azarmi Scalable Big Data Architecture

A practitioner's guide to choosing relevant Big Data architecture
1st edition, 2015
ISBN: 978-1-4842-1326-1
Publisher: Apress
Format: PDF
Copy protection: 1 - PDF watermark

This book highlights the different types of data architecture and illustrates the many possibilities hidden behind the term "Big Data," from the use of NoSQL databases to the deployment of stream analytics architecture, machine learning, and governance.

Scalable Big Data Architecture covers real-world, concrete industry use cases that leverage complex distributed applications, which involve web applications, RESTful APIs, and a high throughput of large amounts of data stored in highly scalable NoSQL data stores such as Couchbase and Elasticsearch. The book demonstrates how data processing can be done at scale, from the use of NoSQL data stores to their combination with a Big Data distribution.
When the data processing is too complex and involves different processing topologies, such as long-running jobs, stream processing, correlation of multiple data sources, and machine learning, it is often necessary to delegate the load to Hadoop or Spark and use a NoSQL store to serve the processed data in real time.
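The "process heavy, serve light" pattern described here can be sketched in a few lines of plain Python. This is a toy illustration, not a real Hadoop or Spark job: the dict stands in for a NoSQL serving store such as Couchbase, and the aggregation function stands in for a long-running batch job.

```python
# Toy sketch of the batch-then-serve pattern: a heavy batch step
# pre-aggregates raw events, and a key-value "serving store" (a dict
# standing in for a NoSQL store) answers queries in real time
# without ever touching the raw data again.
from collections import defaultdict


def batch_aggregate(events):
    """Stand-in for a long-running Hadoop/Spark job: fold raw
    (user, amount) events into per-user totals."""
    totals = defaultdict(int)
    for user, amount in events:
        totals[user] += amount
    return dict(totals)  # result set pushed to the serving store


# Raw, high-volume input (in practice this would live in HDFS or Kafka).
raw_events = [("alice", 3), ("bob", 5), ("alice", 7)]

# "Write" phase: the batch job populates the serving store.
serving_store = batch_aggregate(raw_events)

# "Read" phase: real-time queries are O(1) lookups against
# precomputed results.
print(serving_store["alice"])  # -> 10
```

The point of the split is that query latency no longer depends on the volume of raw data, only on the size of the precomputed result set.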
This book shows you how to choose a relevant combination of Big Data technologies available within the Hadoop ecosystem. It focuses on processing long-running jobs, architecture, stream data patterns, log analysis, and real-time analytics. Every pattern is illustrated with practical examples that use different open source projects such as Logstash, Spark, and Kafka.
Traditional data infrastructures are built for digesting and rendering data synthesis and analytics from large amounts of data. This book helps you understand why you should consider using machine learning algorithms early in the project, before being overwhelmed by the constraints imposed by the high throughput of Big Data.
Scalable Big Data Architecture is for developers, data architects, and data scientists looking for a better understanding of how to choose the most relevant pattern for a Big Data project and which tools to integrate into that pattern.


Target audience


Popular/general


Authors/Editors


Further information & material


Chapter 1: I think I have a Big (data) Problem (20 pages)

Chapter Goal: This chapter introduces the common limitations encountered when dealing with large amounts of data, and the common solutions to those problems. The goal is to lay the foundation of the heterogeneous architecture described in the following chapters.

1- Identifying Big Data symptoms

2- Understanding the Big Data project ecosystem

3- Creating the foundation of a long-term Big Data architecture

Chapter 2: Early Big Data with NoSQL (30 pages)

Chapter Goal: This chapter describes how a NoSQL database can be the starting point for your Big Data project, how it can deal with large amounts of data, what the limits of this model are, and how it can be scaled to a full-fledged Big Data project.

1- Choosing the right NoSQL database

2- Introduction to Couchbase

3- Introduction to Elasticsearch

4- Using a NoSQL cache in a SQL-based architecture

Chapter 3: Big Data processing jobs topology (30 pages)

Chapter Goal: The more data you get, the more important it is to split the processing into different jobs depending on the topology of the processing.

1- Big Data Job processing strategy

2- Smart data extraction from a NoSQL database

3- Short-term processing jobs

4- Long-term processing jobs

Chapter 4: Big Data Streaming Pattern (30 pages)

Chapter Goal: This chapter helps readers understand their options when dealing with high-throughput streaming data.

1- Identifying streaming data sources

2- Streaming with Big Data projects (Flume) versus Enterprise Service Bus

3- Processing architecture for stream data

Chapter 5: Querying and Analysing Patterns (30 pages)

Chapter Goal: In this chapter, readers will understand how to leverage the processing work through long-term and real-time data querying.

1- "Process then Query" strategy versus real-time querying

2- Process, store and query data in Elasticsearch

3- Real-Time querying using Spark

Chapter 6: How About Learning from your Data? (30 pages)

Chapter Goal: This chapter introduces machine learning at the different levels of the patterns described previously, and through different related methodologies.

1- Introduction to machine learning

2- Supervised and Unsupervised learning

3- A simple example of Machine learning

4- Using MLlib for machine learning

Chapter 7: Governance Considerations (20 pages)

Chapter Goal: Monitoring, and more generally governance, is extremely important when dealing with an architecture that involves all the previous patterns. This chapter helps safeguard the reader from major issues and shows how to gain visibility and control over the architecture.

1- Data Quality

2- Architecture Scalability

3- Security

4- Monitoring     


Bahaaldine Azarmi is the co-founder and CTO of reach five, a social data marketing platform. He has a strong background and expertise in REST APIs and Big Data architecture. Prior to founding reach five, he worked as a technical architect and evangelist for large software vendors such as Oracle and Talend.

He holds a master's degree in computer science from the Polytech'Paris engineering school in Paris.


