Gazit / Ghaffari | Mastering NLP From Foundations to Agents | E-Book | www.sack.de
E-book, English, 694 pages

Gazit / Ghaffari Mastering NLP From Foundations to Agents

Building AI Agents through Agentic Automation and RAG Workflows with Python
2nd edition, 2026
ISBN: 978-1-80610-612-7
Publisher: De Gruyter
Format: EPUB
Copy protection: 0 - No protection




Natural Language Processing has evolved beyond rule-based systems and classical machine learning (ML). This second edition guides you through that transformation from mathematical and ML foundations to large language models, retrieval pipelines, agentic automation, and AI-native system design. It strengthens core NLP concepts while expanding into modern architectures such as transformers, parameter-efficient fine-tuning (LoRA and QLoRA), and alignment methods like RLHF and DPO.
You'll begin with essential linear algebra, probability, and ML principles before moving into text preprocessing, feature engineering, classification pipelines, and deep learning architectures. From there, the focus shifts to system design: building Retrieval-Augmented Generation (RAG) pipelines, implementing model routing strategies that balance cost and performance, and orchestrating structured multi-agent workflows. You'll also be introduced to structured interoperability patterns, including the Model Context Protocol (MCP). Governance and safety are treated as architectural concerns, with demonstrations of how policy and compliance can be integrated directly into AI systems. By the end, you will have the tools to implement NLP techniques and be equipped to design, govern, and deploy intelligent systems built on them.

Further information and material


1


An Introduction to the NLP Landscape


This first chapter is aimed at helping professionals gain a foundation in natural language processing (NLP) by introducing its key concepts, early strategies for machine processing of language, and its synergy with machine learning (ML). We also highlight the importance of mathematical foundations such as linear algebra, statistics, probability, and optimization theory, which are necessary to understand the algorithms used in NLP.

We will discuss some of the initial challenges faced in NLP, such as understanding the context and meaning of words, the relationships between them, and how the traditional methods for understanding those characteristics require labeled data. We will then touch on more recent advancements, including pre-trained language models such as BERT and GPT, and the availability of large amounts of text data, which have led to improved performance on NLP tasks. These models, as we will show later in the book, leverage methods that require much less labeled data.

This introduction will engage you by showing how NLP and ML come together to form more accurate and effective systems, laying the groundwork for the more advanced topics covered later in the book.

We will cover the following topics in this chapter:

  • What is natural language processing?
  • Foundational strategies for processing natural language
  • The synergy of NLP and ML
  • Introduction to math and statistics in NLP

The evolution of NLP


NLP is a field of artificial intelligence (AI) focused on the interaction between computers and human languages. NLP involves using computational techniques to understand, interpret, and generate human language, making it possible for computers to understand and respond to human input naturally and meaningfully. The history of NLP is a fascinating journey through time, tracing back to the 1950s, with significant contributions from pioneers such as Alan Turing.

Turing’s 1950 paper, “Computing Machinery and Intelligence,” did far more than pose the simple question “Can machines think?”; it reframed the problem into the Imitation Game, later known as the Turing Test. This shift from vague philosophical inquiry to an operational test was groundbreaking because it offered a practical criterion for assessing machine intelligence through language. Turing also outlined how digital computers, as universal machines, could in principle simulate any process of reasoning, foreshadowing much of modern AI. He anticipated objections ranging from theological to mathematical and argued that machines could, with sufficient programming and memory, eventually rival human thought. Importantly, he emphasized the idea of learning machines, suggesting that rather than building adult-level intelligence directly, one could design machines that learn and develop like children. These insights made his essay not only a philosophical milestone but also a technical roadmap that directly influenced the trajectory of NLP and AI.

This period marked the inception of symbolic NLP, characterized by rule-based systems. A notable example is the Georgetown experiment of 1954, which ambitiously aimed to solve machine translation by automatically translating Russian content into English (see https://en.wikipedia.org/wiki/Georgetown%E2%80%93IBM_experiment). Despite early optimism, progress was slow, revealing the complexities of language understanding and generation.

The 1960s and 1970s saw the development of early NLP systems, which demonstrated the potential for machines to engage in human-like interactions using limited vocabularies and knowledge bases. This era also witnessed the creation of conceptual ontologies, crucial for structuring real-world information in a computer-understandable format.

One of the better-known ontology-style systems of that era is Conceptual Dependency Theory, developed by Roger Schank in 1969. It provided a way to represent the meaning of sentences independent of the words used, by mapping actions, actors, times, and locations into a structured set of primitive actions and transitions. Another system is KL-ONE, from the late 1970s, which introduced a frame-based taxonomy (hierarchies of concepts, with roles) for organizing domain knowledge and enabling inference about class membership. These systems illustrated early attempts to formalize knowledge representations that could underlie machine understanding of language. They played a foundational role in showing both the power and limits of manually constructed knowledge structures.

However, the limitations of rule-based methods led to a paradigm shift in the late 1980s towards statistical NLP, fueled by advances in ML and increased computational power. This shift enabled more effective learning from large corpora, significantly advancing machine translation and other NLP tasks. It was more than a technological and methodological advance; it also marked a conceptual change in how NLP approached linguistics. Moving away from the rigidity of predefined grammar rules, the field embraced corpus linguistics, a method that allows machines to “perceive” and understand languages through extensive exposure to large bodies of text. This approach is empirical and data-driven: patterns and meanings are derived from actual language use rather than theoretical constructs, which enables more nuanced and flexible language processing.

At the start of the 21st century, the growth of the web made vast amounts of data available, catalyzing research in unsupervised and semi-supervised learning algorithms. The breakthrough came with the advent of neural NLP in the 2010s, when deep learning (DL) techniques began to dominate, offering unprecedented accuracy in language modeling and parsing. This era has been marked by the development of models such as Word2Vec and the proliferation of deep neural networks, driving NLP towards more natural and effective human-computer interaction. As we continue to build on these advancements, NLP stands at the forefront of AI research, with its history reflecting a relentless pursuit of understanding and replicating the nuances of human language.

In recent years, NLP has also been applied to a wide range of industries, such as healthcare, finance, and social media, where it has been used to automate decision-making and enhance communication between humans and machines. For example, NLP has been used to extract information from medical documents, analyze customer feedback, translate documents between languages, and search through enormous amounts of posts.

Foundational strategies for processing natural language


Traditional methods in NLP rely on text preprocessing, an essential step in NLP and ML applications. It involves cleaning and transforming the original text data into a form that can be easily understood and analyzed by ML algorithms. The goal of preprocessing is to remove noise and inconsistencies and standardize the data, making it more suitable for advanced NLP and ML methods such as linear models, decision trees, ensembles of trees, and various other models that need to be trained.
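As a rough illustration of what "cleaning and standardizing" can mean in practice, the following minimal sketch lowercases the text, replaces punctuation with spaces, and splits on whitespace. The function name `preprocess` and the sample string are assumptions made for this illustration, not taken from the book, and real pipelines usually involve more steps.

```python
import string

# Translation table that maps every ASCII punctuation character to a space.
# (Illustrative choice: real pipelines may treat punctuation differently.)
_PUNCT_TO_SPACE = str.maketrans(string.punctuation, " " * len(string.punctuation))


def preprocess(text: str) -> list[str]:
    """Hypothetical helper: lowercase, drop punctuation, split into tokens."""
    text = text.lower()                      # standardize case
    text = text.translate(_PUNCT_TO_SPACE)   # remove punctuation noise
    return text.split()                      # split() also collapses whitespace runs


raw = "  The QUICK brown fox -- it jumped, twice!  "
tokens = preprocess(raw)
print(tokens)
# ['the', 'quick', 'brown', 'fox', 'it', 'jumped', 'twice']
```

Splitting on whitespace is the simplest possible tokenizer; it is used here only to keep the sketch self-contained.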

One of the key benefits of preprocessing is that it can significantly improve the performance of ML algorithms. For example, removing stop words, which are common words that carry little meaning, such as “the” and “is,” reduces the dimensionality of the data. This makes it more likely that the algorithm identifies relevant patterns and less likely that it picks up spurious patterns formed by the stop words themselves.

Take the following sentence as an example:

After removing the stop words, we have the following:

In the example sentence, the stop words “I,” “am,” “to,” “the,” “some,” and “and” add little meaning and can be removed without changing the overall meaning of the sentence. It should be emphasized that stop-word removal needs to be tailored to the specific objective, as omitting a particular word might be trivial in one context but detrimental in another.
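Stop-word removal of this kind can be sketched in a few lines of Python. The sentence, the stop-word set, and the function name `remove_stop_words` below are illustrative assumptions, not the book's own example or a standard list. The stop-word set is passed as a parameter so that it can be adjusted per task.

```python
# Illustrative stop-word set (an assumption for this sketch, not a standard list).
DEFAULT_STOP_WORDS = frozenset({"i", "am", "to", "the", "some", "and", "is", "a"})


def remove_stop_words(text: str, stop_words=DEFAULT_STOP_WORDS) -> list[str]:
    """Hypothetical helper: lowercase, split on whitespace, drop stop words."""
    tokens = text.lower().split()
    return [token for token in tokens if token not in stop_words]


# Hypothetical sentence chosen to contain the stop words discussed above.
sentence = "I am going to the store to buy some milk and bread"

filtered = remove_stop_words(sentence)
print(filtered)
# ['going', 'store', 'buy', 'milk', 'bread']

# Task-specific variant: keep "some" because it may matter for the objective.
custom_stop_words = DEFAULT_STOP_WORDS - {"some"}
filtered_custom = remove_stop_words(sentence, custom_stop_words)
print(filtered_custom)
# ['going', 'store', 'buy', 'some', 'milk', 'bread']
```

Keeping the stop-word set configurable reflects the point that a word that is safe to drop for one objective may carry meaning for another.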

For instance, consider the stop word “some” in the following example:

If “some” is removed, the sentence becomes:

This...


