E-book, English, 338 pages
Funderburk: Building Natural Language and LLM Pipelines
1st edition, 2025
ISBN: 978-1-83546-700-8
Publisher: De Gruyter
Format: EPUB
Copy protection: 0 - No protection
Build production-grade RAG, tool contracts, and context engineering with Haystack and LangGraph
Modern LLM applications often break in production due to brittle pipelines, loose tool definitions, and noisy context. This book shows you how to build production-ready, context-aware systems using Haystack and LangGraph. You'll learn to design deterministic pipelines with strict tool contracts and deploy them as microservices. Through structured context engineering, you'll orchestrate reliable agent workflows and move beyond simple prompt-based interactions.
You'll start by understanding LLM behavior (tokens, embeddings, and transformer models) and see how prompt engineering has evolved into a full context engineering discipline. Then, you'll build retrieval-augmented generation (RAG) pipelines with retrievers, rankers, and custom components using Haystack's graph-based architecture. You'll also create knowledge graphs, synthesize unstructured data, and evaluate system behavior using Ragas and Weights & Biases. In LangGraph, you'll orchestrate agents with supervisor-worker patterns, typed state machines, retries, fallbacks, and safety guardrails.
By the end of the book, you'll have the skills to design scalable, testable LLM pipelines and multi-agent systems that remain robust as the AI ecosystem evolves.
Preface
Between 2023 and 2025, the release of OpenAI’s large language models (LLMs) as REST endpoints captivated professionals across industries with their ability to understand and respond to natural language. In 2024, we marveled at their ability to produce context-aware answers grounded in a corpus of documents. This approach, known as retrieval-augmented generation (RAG), quickly solidified itself as a cornerstone technique in modern AI.
As we experimented with this technology, we discovered that LLMs could do more than answer questions: they could be extended to use tools to solve problems. This unlocked new possibilities for software engineers, who began developing the tools necessary to enable what we now, in 2025, refer to as agents. While the capabilities introduced by RAG and agents represent an exciting step toward more capable artificial intelligence, the path to integrating agents into real-world systems remains fraught with challenges. Looking ahead to 2026 and beyond, this book argues that the focus will move beyond capability and center on reliability.
A recurring theme throughout the evolution from LLMs to RAG, and from RAG to agents, is the presence of hallucinations: a phenomenon in which an LLM produces coherent-sounding responses that are false. This phenomenon is at the center of what this book refers to as the “agentic reliability crisis of 2025”. In this book, we will prove that an agent powered by an LLM is only as robust as the data, tools, and context it is provided. We will show that reliability and system integrity within an agentic system are not inherent properties of LLMs, but the result of careful engineering and systems design.
The central argument of this book is simple: the path to production-grade, trustworthy AI hinges on the rigorous application of classic data processing techniques. We introduce the tool vs. orchestration layer pattern, in which high-quality, robust, and scalable tools are developed as microservices, and agents are then equipped to use those tools through disciplined context engineering. This includes four core strategies: write, select, compress, and isolate. These strategies help us carefully manage the information the agent receives at each step of its problem-solving process, allowing it to efficiently and accurately resolve the task it was given.
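As a rough illustration, the four strategies can be sketched as plain functions operating on an agent's context. The `Context` structure and every function name below are hypothetical stand-ins, not code from the book; a real system would use a retriever instead of keyword overlap and a summarizer instead of truncation:

```python
# A minimal, hypothetical sketch of the four context engineering
# strategies: write, select, compress, and isolate.
from dataclasses import dataclass, field

@dataclass
class Context:
    scratchpad: list[str] = field(default_factory=list)  # working memory
    documents: list[str] = field(default_factory=list)   # retrieved material

def write(ctx: Context, note: str) -> None:
    """Write: persist an intermediate result outside the prompt."""
    ctx.scratchpad.append(note)

def select(ctx: Context, query: str, k: int = 2) -> list[str]:
    """Select: keep only the documents most relevant to the query
    (naive keyword overlap stands in for a real retriever)."""
    scored = sorted(
        ctx.documents,
        key=lambda d: -len(set(d.lower().split()) & set(query.lower().split())),
    )
    return scored[:k]

def compress(docs: list[str], max_chars: int = 80) -> list[str]:
    """Compress: shrink each document (a summarizer in practice)."""
    return [d[:max_chars] for d in docs]

def isolate(task: str, docs: list[str]) -> dict:
    """Isolate: hand a sub-agent only the context it needs."""
    return {"task": task, "context": docs}

ctx = Context(documents=[
    "haystack builds RAG pipelines",
    "langgraph orchestrates agents",
    "cooking recipes for pasta",
])
write(ctx, "user asked about RAG")
picked = select(ctx, "RAG pipelines")
payload = isolate("answer the RAG question", compress(picked))
```

The point of the sketch is the division of labor: each strategy shapes what the LLM sees at a given step, so irrelevant material (here, the pasta document) never reaches the agent at all.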
By the end of this book, you will master two graph-based architectures and learn how to combine them to build highly dynamic, yet observable and resilient agentic systems:
- Haystack: A specialized framework that uses directed graphs for building robust, data-intensive, and scalable tools.
- LangGraph: A low-level orchestration framework for creating customizable agents and diverse control flows that let an LLM execute the tools it is given.
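To make the "directed graph" idea concrete before meeting either framework, here is a framework-free sketch of the underlying pattern: components are nodes, dependencies are edges, and execution follows a topological order of the graph. This is not the Haystack or LangGraph API, just the shared pattern behind both:

```python
# A toy directed-graph pipeline: components run in dependency order,
# and each component reads the outputs of its upstream nodes.
# This mimics the pattern behind graph-based frameworks; it is not a real API.
from graphlib import TopologicalSorter

class Pipeline:
    def __init__(self):
        self.components = {}  # name -> callable(results dict) -> output
        self.deps = {}        # name -> set of upstream component names

    def add_component(self, name, fn, depends_on=()):
        self.components[name] = fn
        self.deps[name] = set(depends_on)

    def run(self, **inputs):
        results = dict(inputs)
        # TopologicalSorter guarantees every dependency runs first
        for name in TopologicalSorter(self.deps).static_order():
            results[name] = self.components[name](results)
        return results

# Wire a retriever -> ranker -> generator chain
pipe = Pipeline()
pipe.add_component("retriever",
                   lambda r: [d for d in r["docs"] if r["query"] in d])
pipe.add_component("ranker",
                   lambda r: sorted(r["retriever"], key=len),
                   depends_on=["retriever"])
pipe.add_component("generator",
                   lambda r: f"Answer based on: {r['ranker'][0]}",
                   depends_on=["ranker"])

out = pipe.run(query="RAG", docs=["RAG pipelines", "agents", "RAG"])
```

Because the data flow is an explicit graph rather than hidden control flow, each node can be tested, swapped, or observed in isolation, which is the property both frameworks exploit.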
This book has a strong focus on practical examples. You will find plenty of Jupyter notebooks, Python and shell scripts, and Docker container images to help you apply the concepts introduced in each chapter. Through a series of mini-projects, you will progressively build expertise, and you will conclude your journey with a blueprint for creating sovereign agents: fully owned AI that can run locally or on edge devices.
Who this book is for
This book is for NLP engineers, LLM application developers, and data scientists looking for stable, testable building blocks for retrieval, summarization, and ranking. It is ideal for technical leads and architects designing production-grade LLM tools, as well as teams tasked with modernizing legacy NLP pipelines into robust RAG and agentic systems.
What this book covers
Chapter 1 defines the agentic reliability crisis of 2025 and reframes classic data pipelines as the foundational reliability layer required for autonomous agents. It introduces text processing fundamentals such as tokenization and embeddings as prerequisites for building trustworthy AI.
Chapter 2 introduces context engineering as a formal discipline for managing information environments and traces the evolution of models from the 2023 baselines to the specialized reasoning engines of 2025.
Chapter 3 explores the explicit, graph-based architecture of Haystack 2.0 for building tools with strict data contracts.
Chapter 4 demonstrates how to construct production-grade indexing, multimodal, and hybrid RAG pipelines to solve complex retrieval problems such as "vocabulary mismatch".
Chapter 5 focuses on extending the Haystack framework by building specialized components, such as a knowledge graph generator and a synthetic test data generator for PDFs and scraped websites.
Chapter 6 details how to transition from experimentation to engineering using Docker and uv, while implementing quantitative evaluation with RAGAS and observability with Weights & Biases.
Chapter 7 compares deployment strategies using FastAPI for custom control and Hayhooks for rapid REST API generation from serialized YAML pipelines.
Chapter 8 applies the tool vs. orchestration thesis introduced in this book through a series of projects for classic NLP tasks, such as named-entity recognition, text classification, and sentiment analysis, culminating in a multi-agent system built with LangGraph and Haystack microservices.
Chapter 9 explores the cutting edge of 2026, including hardware optimization with NVIDIA NIMs and emerging protocols such as the Model Context Protocol (MCP) and Agent-to-Agent (A2A).
Chapter 10 provides a final synthesis of the book's journey, analyzing the evolution of agentic architectures through case studies that evaluate three kinds of agentic architectures against token economics and system integrity when microservices fail.
To get the most out of this book
To get the most out of this book, you should have a solid grasp of Python. Familiarity with core data science projects is recommended but not required. This book is written as a code-heavy, architecture-first guide designed for practitioners ready to transition from experimental scripts to architecting robust, containerized, and stateful agentic applications. It is not intended for non-technical users seeking no-code solutions or prompt hacks.
It is recommended that you use an IDE to interact with the code, such as VSCode, Cursor, or PyCharm. If you're a Windows user, it is strongly recommended that you use WSL (https://code.visualstudio.com/docs/remote/wsl). We will use uv for package management, and a separate virtual environment is used for the exercises in each chapter. You will need to install Docker Desktop (https://docs.docker.com/desktop/) to complete the exercises for and . The Jupyter notebooks and scripts use OpenAI and as such require an OpenAI API key; however, commented code snippets are provided in the code to enable you to work with a local model using Ollama. The only exception to this is in , where we will use an OpenAI model as an LLM-as-a-judge and measure the token and cost usage of large and small embedding models.
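A typical setup with uv might look like the following sketch. The repository URL is the book's official one; the requirements file name is an assumption, so check the repository's README for the exact per-chapter instructions:

```shell
# Illustrative setup sketch; the requirements file name is hypothetical,
# so follow the book repository's README for the exact commands.
git clone https://github.com/PacktPublishing/Building-Natural-Language-and-LLM-Pipelines.git
cd Building-Natural-Language-and-LLM-Pipelines

# Create and activate an isolated virtual environment with uv
uv venv .venv
source .venv/bin/activate

# Install the chapter's dependencies into that environment
uv pip install -r requirements.txt

# Provide your OpenAI key (or switch to the commented Ollama snippets instead)
export OPENAI_API_KEY="sk-..."
```

Creating a fresh environment per chapter keeps dependency conflicts between exercises from leaking into one another.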
Download the example code files
The code bundle for the book is hosted on GitHub at https://github.com/PacktPublishing/Building-Natural-Language-and-LLM-Pipelines. We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing. Check them out!
Download the color images
We also provide a PDF file that has color images of the screenshots and diagrams used in this book. You can download it here: https://packt.link/gbp/9781835467992
Conventions used
There are a number of text conventions used throughout this book.
Code in text: Indicates code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles. For example: "Through the method, we can easily create a Mermaid graph of our data flow."
A block of code is set as follows:
Any command-line input or output is written as follows:
Bold:...




