
SGLang: Structured Generation for Tool Use, JSON Outputs, and Fast Inference
Indigo
Current price: $13.64


Size: Kobo eBook
*Product information may vary - to confirm product availability, pricing, shipping and return information please contact Indigo
"SGLang: Structured Generation for Tool Use, JSON Outputs, and Fast Inference"
Large language models are most valuable when they produce outputs that software can trust, tools can execute, and production systems can serve efficiently. This book is written for experienced developers, ML engineers, and infrastructure-minded practitioners who want to move beyond prompt tinkering into disciplined, high-performance structured generation. It presents SGLang not merely as a prompting framework, but as a programming model and serving stack for building reliable, machine-oriented LLM systems.
Across the book, readers will learn how to design constrained generation workflows, enforce JSON and schema-based contracts, choose among regex, EBNF, and JSON Schema constraints, and integrate tool-calling patterns with OpenAI-compatible interfaces. The book also examines grammar backends, multi-step validation loops, and the mechanics of constrained decoding, then goes deeper into runtime internals such as prefix caching, continuous batching, scheduling, prefill-decode disaggregation, quantization, and production tuning. The result is a complete technical map from structured outputs to scalable deployment.
Distinguished by its systems-level perspective, this book treats correctness and performance as inseparable concerns. Readers should already be comfortable with modern LLM application development, Python-based tooling, and production deployment concepts. In return, they will gain a rigorous understanding of how to build SGLang-based systems that are robust, observable, version-aware, and


















