AI Engineering

Structured Output: Getting LLMs to Return JSON, Schemas, and Typed Data

Lesson Objectives

Understand the problem of unpredictable LLM response format and why prompt instructions aren't enough
Master JSON Mode (response_format) as the first level of guarantees
Learn to use Zod schemas with the OpenAI SDK for type-safe structured output
Understand function calling - the mechanism for invoking external functions through LLMs
Build a robust error handling system: retry, fallback, validation

JSON mode doesn't guarantee usable JSON. At best it guarantees syntactically valid JSON, and not even that when the response is cut off at the token limit. The difference is a production incident at 3 AM: `JSON.parse` throws, the pipeline stops, the pager fires. Structured output is a contract. LLMs break contracts. That is why you validate with Zod instead of hoping.

Linear AI extracts structured data from free text - title, priority, assignee - and creates issues without a single click. Structured Output + Zod schema
Notion AI parses unstructured notes into databases with 99.9% reliability. Without Structured Output it's 80% - unacceptable for a product
GitHub Copilot uses function calling for IDE integration - file search, running tests, code navigation. The model decides what to call; the IDE executes
Vercel v0 generates React components via function calling - the model "calls" a UI creation function with component parameters

Evolution of Structured Output in the OpenAI API

**June 2023**: function calling - the model can choose functions and return structured arguments. The first step toward AI agents.
**November 2023**: JSON Mode (`response_format: { type: 'json_object' }`) - syntactically valid JSON, but no schema control.
**August 2024**: Structured Outputs - constrained decoding at the token level. The model physically cannot violate the schema. Zod integration via `zodResponseFormat`.
**Anthropic** in parallel added `tool_use` (the equivalent of function calling) and the prefill trick for JSON.

In just over a year the industry went from "ask it to return JSON" to guaranteed typed contracts.

Prerequisites

Production Prompt Patterns: system/user/assistant, Few-Shot, Chain-of-Thought

The Problem: LLM Returns a String, but Backend Expects an Object

LLMs return **free-form text** by default. That's a fundamental problem: backend systems don't operate on arbitrary strings - they need parsed objects with a predictable structure. Asking a model to "return JSON" is asking it to try. Whether it tries at 3 AM under production load is a different question.

The model knows JSON perfectly well - that's not the issue. The issue is the **absence of a contract**. The model treats "return JSON" as a stylistic suggestion, not an engineering requirement. Production needs 100% format reliability, not 90%. The gap between 90% and 100% is the difference between a caught parse error and a Slack incident at 3 AM.
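To make the failure mode concrete, here is a minimal sketch of what happens when a model answers a "return JSON" prompt with a friendly sentence around the payload. The two sample responses and the `safeJsonParse` helper are hypothetical, invented for this illustration.

```typescript
// Result type: either the parsed value or the parse error message.
type ParseResult =
  | { ok: true; value: unknown }
  | { ok: false; error: string };

// Hypothetical helper: turns a JSON.parse exception into a value the caller can branch on.
function safeJsonParse(raw: string): ParseResult {
  try {
    return { ok: true, value: JSON.parse(raw) };
  } catch (err) {
    return { ok: false, error: err instanceof Error ? err.message : String(err) };
  }
}

// Two made-up responses to the same prompt.
const cleanResponse = '{"title": "Fix login", "priority": "high"}';
const chattyResponse =
  'Sure! Here is the JSON you asked for: {"title": "Fix login", "priority": "high"}';

const cleanResult = safeJsonParse(cleanResponse);   // parses
const chattyResult = safeJsonParse(chattyResponse); // fails: the text starts with "Sure!"
```

The payload inside the chatty response is fine; the surrounding prose is what breaks the parser. That is the gap a format contract is meant to close.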

| Approach | Reliability | When to Use |
| --- | --- | --- |
| Prompt "return JSON" | ~80-90% | Prototypes, not for production |
| JSON Mode (response_format) | ~99% | When valid JSON is needed without a strict schema |
| Structured Output (Zod schema) | ~99.9% | Production: guaranteed structure |
| Function Calling | ~99.9% | When the model needs to invoke functions |

**Even with JSON Mode and Structured Output, edge cases happen.** The model might return `null` in a field the schema allows to be null, an empty string instead of a value, or a value that is well-formed but breaks a business rule. Application-side validation is always mandatory.

Myth: JSON mode guarantees usable JSON

Reality: JSON mode only makes the response parse as JSON, and even that can fail if the output is truncated at the token limit. Field structure is still arbitrary - client-side validation is required

With `response_format: { type: 'json_object' }`, `JSON.parse` won't throw. But the model can return `{"person": "Anna"}` instead of `{"name": "Anna"}` - and that's valid JSON. For structure guarantees, Structured Output with a Zod schema is needed. JSON mode removes the markdown wrapper. Structured Output removes field unpredictability. These are different levels of guarantees.
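The `person` versus `name` case can be reproduced without any API call. This is a small sketch with a hand-written type guard; the `Contact` interface and `isContact` function are illustrative stand-ins for what a Zod schema would do in a real project.

```typescript
interface Contact {
  name: string;
}

// Hand-written runtime check; a Zod schema would normally play this role.
function isContact(value: unknown): value is Contact {
  return (
    typeof value === "object" &&
    value !== null &&
    typeof (value as Record<string, unknown>).name === "string"
  );
}

// Both strings are valid JSON, so JSON.parse succeeds for both.
const expectedShape: unknown = JSON.parse('{"name": "Anna"}');
const driftedShape: unknown = JSON.parse('{"person": "Anna"}');

const expectedOk = isContact(expectedShape); // true
const driftedOk = isContact(driftedShape);   // false: valid JSON, wrong field name
```

Only the runtime check separates the two objects, which is why the parse step alone is not a contract.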

Why is the "ask the model to return JSON in the prompt" approach unreliable for production?

JSON Mode and response_format: The First Level of Guarantees

November 2023. OpenAI ships `response_format: { type: 'json_object' }` - **JSON Mode**. The model stops wrapping responses in markdown and adding explanatory text. Clean JSON only. `JSON.parse` stops throwing, but nothing yet controls which fields come back.

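**JSON Mode requires mentioning JSON in the prompt.** If no message in the request contains the word "JSON" (the system prompt is the usual place for it), the API returns an error. This is a safeguard: without an explicit instruction to produce JSON, the model can emit whitespace until it hits the token limit.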
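A request with JSON Mode enabled might look like the sketch below. The request body is written as a plain object so it runs without the SDK; the model name, the prompt text, and the `mentionsJson` pre-flight helper are assumptions for illustration. The helper matches case-insensitively, which is a guess at how lenient the API's own check is rather than a documented guarantee.

```typescript
type Role = "system" | "user" | "assistant";

interface ChatMessage {
  role: Role;
  content: string;
}

// Hypothetical pre-flight check: fail fast locally instead of waiting for an API error.
function mentionsJson(messages: ChatMessage[]): boolean {
  return messages.some((m) => /json/i.test(m.content));
}

const jsonModeMessages: ChatMessage[] = [
  {
    role: "system",
    content: "Extract the contact and reply in JSON with the keys name and email.",
  },
  { role: "user", content: "Anna, anna@mail.com" },
];

// Chat Completions request body with JSON Mode switched on.
const jsonModeRequest = {
  model: "gpt-4o-mini", // assumed model name
  messages: jsonModeMessages,
  response_format: { type: "json_object" as const },
};
```

Naming the expected keys in the system prompt, as above, is still only a request. JSON Mode itself does nothing to enforce them.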

JSON Mode is only the first level. Valid JSON != correct JSON. The model can return `{"person": "Anna", "mail": "anna@mail.com"}` instead of `{"name": "Anna", "email": "anna@mail.com"}` - and `JSON.parse` will swallow that without an error. Field structure is only controlled by the next level: Structured Output.

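**The prefill trick for Claude** (starting the assistant turn with `{` so the model continues the JSON rather than opening with prose) works but is inelegant. Anthropic's `tool_use`, its equivalent of function calling, provides structured output more reliably. More on this in the upcoming concepts.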
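For reference, the prefill trick amounts to ending the message list with a partial assistant turn and gluing that prefix back onto the completion before parsing. The sketch below uses plain objects instead of the Anthropic SDK, the helper names are hypothetical, and the completion string is simulated. It also assumes the target model accepts a prefilled assistant turn, which is worth checking for the model in use.

```typescript
interface AnthropicMessage {
  role: "user" | "assistant";
  content: string;
}

const PREFILL = "{";

// The final assistant message is the prefill: the model continues from this text.
function buildPrefilledMessages(userPrompt: string): AnthropicMessage[] {
  return [
    { role: "user", content: userPrompt },
    { role: "assistant", content: PREFILL },
  ];
}

// The completion does not repeat the prefill, so it is re-attached before parsing.
function parsePrefilledCompletion(completionText: string): unknown {
  return JSON.parse(PREFILL + completionText);
}

const prefilledMessages = buildPrefilledMessages(
  "Extract name and email as JSON from: Anna, anna@mail.com",
);

// Simulated model output: everything after the opening brace.
const parsedContact = parsePrefilledCompletion('"name": "Anna", "email": "anna@mail.com"}');
```

Nothing here constrains the fields or even guarantees the closing brace, which is the inelegance the paragraph above refers to.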

How does JSON Mode (`response_format: { type: 'json_object' }`) differ from Structured Output?

Zod + OpenAI: Type-Safe Structured Output

August 2024. OpenAI ships Structured Outputs - **constrained decoding**: the model physically cannot generate a token that violates the schema. Not "tries to return" but "cannot not return". That's a structural shift - it's no longer an agreement, it's a physical constraint on generation.

The OpenAI SDK supports **Zod schemas** natively via `zodResponseFormat`. The Zod schema describes the exact structure of the expected response, the SDK converts it to JSON Schema for the API, and constrained decoding guarantees conformance. The result is an already-typed object - no manual parsing needed.
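On the wire, a Structured Outputs request carries a JSON Schema rather than a Zod object. The sketch below shows roughly what such a request body looks like for a small issue schema. The schema name `issue`, its fields, and the prompt are assumptions, and the JSON Schema the SDK actually generates from a Zod schema may differ in detail.

```typescript
// JSON Schema for the expected response. In strict mode every property is
// listed in `required` and `additionalProperties` is false.
const issueSchema = {
  type: "object",
  properties: {
    title: { type: "string" },
    priority: { type: "string", enum: ["low", "medium", "high"] },
    assignee: { type: ["string", "null"] }, // "optional" is expressed as nullable
  },
  required: ["title", "priority", "assignee"],
  additionalProperties: false,
};

// Chat Completions request body asking for output that conforms to the schema.
const structuredRequest = {
  model: "gpt-4o-2024-08-06",
  messages: [
    { role: "system", content: "Extract issue fields from the user's text." },
    { role: "user", content: "Login is broken for SSO users, urgent, give it to Maria." },
  ],
  response_format: {
    type: "json_schema",
    json_schema: { name: "issue", strict: true, schema: issueSchema },
  },
};
```

Writing this object by hand is exactly the work `zodResponseFormat` is meant to remove: the Zod schema stays the single source of truth for both the request and the TypeScript type.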

`.parsed` is already typed by the Zod schema - TypeScript knows which fields exist and what types they are. No `JSON.parse`, no `as any`. This is what a backend developer expects everywhere else in the codebase.

**Structured Output limitations:** it supports only a subset of JSON Schema. At launch, keywords such as `minItems`, `maxItems`, `pattern`, and most `format` values were not supported. The supported subset has changed over time, so check the current API documentation. Any constraint the API does not enforce must be validated application-side after receiving the response.
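Application-side validation of such constraints can be as simple as a second pass over the already-typed object. The sketch below is dependency-free. The `Applicant` shape and the specific rules are hypothetical, and in a Zod codebase a stricter second schema would usually replace the hand-written checks.

```typescript
interface Applicant {
  age: number;
  email: string;
  tags: string[];
}

// Hypothetical business rules that the API-level schema may not enforce.
function businessRuleViolations(a: Applicant): string[] {
  const violations: string[] = [];
  if (!Number.isInteger(a.age) || a.age < 18 || a.age > 100) {
    violations.push("age must be an integer between 18 and 100");
  }
  if (!/^[^@\s]+@[^@\s]+$/.test(a.email)) {
    violations.push("email must look like an address");
  }
  if (a.tags.length < 1 || a.tags.length > 5) {
    violations.push("tags must contain 1 to 5 items");
  }
  return violations;
}

// Structurally valid (right fields, right types) but wrong on every rule.
const structurallyValid: Applicant = { age: 7, email: "anna-at-mail.com", tags: [] };
const foundViolations = businessRuleViolations(structurallyValid);

const withinRules: Applicant = { age: 30, email: "anna@mail.com", tags: ["vip"] };
const noViolations = businessRuleViolations(withinRules);
```

Returning a list of violations rather than throwing on the first one makes it easier to feed every problem back to the model in a single retry prompt.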

A Zod schema defines a field `age: z.number().min(18).max(100)`. Does Structured Output guarantee the model returns a number between 18 and 100?

Function Calling: When the Model Needs to Take Action

June 2023. OpenAI adds function calling. That changes everything: an LLM stops being a chatbot and becomes a component capable of **choosing actions**. That's when the era of AI agents truly began - not in the marketing sense, but in the architectural sense.

Important clarification: the model **does not call** the function itself. It returns JSON with the function name and arguments. The application executes it. The model is the "decision-maker," the application is the "executor." This is critical for security: LLMs never get direct access to production systems.
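The split between decision-maker and executor can be sketched as a small dispatcher. The `ToolCall` interface mirrors the shape of an entry in `tool_calls` in the Chat Completions API. The `get_weather` handler and the allow-list are hypothetical, and a real handler would validate its arguments with a schema before acting.

```typescript
// One entry of `message.tool_calls`; `arguments` arrives as a JSON string.
interface ToolCall {
  id: string;
  type: "function";
  function: { name: string; arguments: string };
}

interface ToolMessage {
  role: "tool";
  tool_call_id: string;
  content: string;
}

type Handler = (args: Record<string, unknown>) => string;

// Allow-list of functions the application is willing to run (hypothetical).
const handlers = new Map<string, Handler>([
  ["get_weather", (args) => `Sunny in ${String(args.city)}`],
]);

// The application, not the model, executes the call and reports the result.
function executeToolCall(call: ToolCall): ToolMessage {
  const reply = (content: string): ToolMessage => ({
    role: "tool",
    tool_call_id: call.id,
    content,
  });

  const handler = handlers.get(call.function.name);
  if (!handler) return reply(`Error: unknown function ${call.function.name}`);

  let args: unknown;
  try {
    args = JSON.parse(call.function.arguments);
  } catch {
    return reply("Error: arguments were not valid JSON");
  }
  if (typeof args !== "object" || args === null || Array.isArray(args)) {
    return reply("Error: arguments must be a JSON object");
  }
  return reply(handler(args as Record<string, unknown>));
}
```

Anything the model names that is not in the map is rejected, so the model's output can only select among actions the application has already decided to expose.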

**Function calling vs Structured Output.** Structured Output = "return data in this structure." Function calling = "choose an action and specify parameters." Structured Output is ideal for extraction; function calling is for agents and tool use. They're often used together.
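In request terms, the difference is where the schema lives: Structured Output puts it in `response_format`, while function calling puts it in a tool definition the model may or may not choose to use. Below is a sketch of the function-calling side as a plain request body. The `create_issue` tool, its parameters, and the model name are assumptions for illustration.

```typescript
// A tool definition: name, description, and a JSON Schema for the arguments.
const createIssueTool = {
  type: "function",
  function: {
    name: "create_issue", // hypothetical tool
    description: "Create an issue in the tracker.",
    strict: true, // ask for arguments that conform to the schema
    parameters: {
      type: "object",
      properties: {
        title: { type: "string" },
        priority: { type: "string", enum: ["low", "medium", "high"] },
      },
      required: ["title", "priority"],
      additionalProperties: false,
    },
  },
};

// The model decides whether to answer in text or to request `create_issue`.
const toolRequest = {
  model: "gpt-4o-2024-08-06",
  messages: [{ role: "user", content: "SSO login is down, please file it as high priority." }],
  tools: [createIssueTool],
  tool_choice: "auto",
};
```

With `tool_choice: "auto"` the response may contain ordinary text instead of a tool call, so the calling code has to handle both outcomes.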

With function calling, the model returns `tool_calls` with a function name and arguments. Who actually executes the function?

Error Handling and Retry Strategies for Structured Output

Structured Output is a contract. LLMs break contracts. Not because they're bad - because they're probabilistic. Even with constrained decoding there are `refusal` responses (safety-related rejections), output cut off at the token limit, `null` in nullable fields, and data outside business rules. Robust error handling isn't optional - it's the engineer's side of the contract.
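One way to keep these cases apart is to classify each response before any parsing happens. The sketch below works on a trimmed-down version of a Chat Completions choice. The `Outcome` type and `classifyChoice` function are hypothetical names, and the ordering of the checks is a design choice rather than anything the API prescribes.

```typescript
// Subset of a Chat Completions choice that matters for error handling.
interface Choice {
  finish_reason: "stop" | "length" | "content_filter" | "tool_calls";
  message: { content: string | null; refusal?: string | null };
}

type Outcome =
  | { kind: "ok"; json: string }
  | { kind: "refusal"; reason: string }
  | { kind: "truncated" }
  | { kind: "filtered" }
  | { kind: "empty" };

// Decide what happened before touching JSON.parse or schema validation.
function classifyChoice(choice: Choice): Outcome {
  if (choice.message.refusal) {
    return { kind: "refusal", reason: choice.message.refusal };
  }
  if (choice.finish_reason === "length") return { kind: "truncated" };
  if (choice.finish_reason === "content_filter") return { kind: "filtered" };
  if (!choice.message.content) return { kind: "empty" };
  return { kind: "ok", json: choice.message.content };
}
```

A refusal is then logged and routed to a fallback, while a truncated response is a candidate for a retry with a larger token budget. Treating both as a generic parse error would hide that difference.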

Function calling adds another layer of fragility: arguments arrive as a JSON string. Even with Structured Output the model can generate technically valid JSON with unexpected values. Zod validation is a mandatory step.

**Order of preference in production:**
1. Structured Output with Zod - used by default.
2. On refusal - logging + fallback.
3. On parse error - retry with exponential backoff.
4. If everything fails - fallback to regex extraction.

The more defense layers, the more reliable the system.
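The retry and fallback layers from the order of preference above can be sketched without any SDK. Everything here is illustrative: the delay values, the `withRetry` signature, and the regex used as a last resort are assumptions, and `attemptFn` stands in for whatever function performs the actual API call and validation.

```typescript
// Delay before retry number `attempt` (0-based): base * 2^attempt, capped at maxMs.
function backoffDelayMs(attempt: number, baseMs = 500, maxMs = 8000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Calls `attemptFn` until it yields a non-null value or the retries run out.
// A thrown error counts as a failed attempt.
async function withRetry<T>(
  attemptFn: () => Promise<T | null>,
  maxRetries = 3,
  wait: (ms: number) => Promise<void> = sleep,
): Promise<T | null> {
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    const result = await attemptFn().catch(() => null);
    if (result !== null) return result;
    if (attempt < maxRetries) await wait(backoffDelayMs(attempt));
  }
  return null;
}

type Priority = "low" | "medium" | "high";

// Last-resort fallback (hypothetical): pull a priority keyword out of raw text.
function regexFallbackPriority(text: string): Priority | null {
  const found = /\b(low|medium|high)\b/i.exec(text)?.[1]?.toLowerCase();
  return found === "low" || found === "medium" || found === "high" ? found : null;
}
```

Passing `wait` as a parameter keeps the backoff testable: a test can supply a no-op instead of real timers. In production, adding random jitter to the delay is a common refinement that this sketch leaves out.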

The model returned a `refusal` (safety-related rejection) on a structured output request. What's the right strategy?

Myth: JSON mode = guaranteed correct JSON

Reality: JSON mode guarantees the model won't add text around the JSON. Field structure is still arbitrary - validation is still required

With `response_format: { type: 'json_object' }`, `JSON.parse` won't throw. But the model can return `{"person": "Anna"}` instead of `{"name": "Anna"}` - valid JSON, wrong structure. For field-level guarantees, Structured Output with a Zod schema is needed. JSON mode removes the markdown wrapper. Structured Output removes field unpredictability. Different levels, different problems.

Key Concepts

Prompt "return JSON" - unreliable (~80-90%). JSON Mode - valid JSON without schema (~99%). Structured Output with Zod - guaranteed structure (~99.9%)
Constrained decoding (Structured Outputs, August 2024) - the model physically cannot violate the schema at the token level
zodResponseFormat + openai.beta.chat.completions.parse = type-safe extraction, no JSON.parse, no as any
Structured Output may not enforce Zod constraints (min, max, pattern) - two-level validation is mandatory: API schema + strict business schema
Function calling (June 2023): the model chooses the function and arguments, the application executes - the foundation of AI agents. LLMs never have direct access to systems
Production: the engineer's side of the contract - retry with backoff + refusal handling + regex fallback = layered defense

What's Next

Structured output ensures a predictable response format. The next step is learning to receive these responses in real time through streaming.

Streaming: SSE and Real-Time Responses — Receiving structured output in streaming mode - chunk by chunk
Tool Calling (Deeper Dive) — Function calling from this lesson -> full tool use and call chains
AI Agents — Function calling + decision chains = autonomous agents

Reflection Questions

In which parts of a current or past project could JSON.parse have thrown due to unexpected format from an external service? Would Structured Output have solved that?
How does constrained decoding actually differ from asking "return JSON" in a prompt? Why are these different levels of guarantees?
When does it make sense to use function calling instead of Structured Output, and vice versa? What's the deciding criterion?

Related Lessons

aie-06-prompt-patterns — Prompts set up the model before constraining its output
aie-16-tool-calling — Function calling here grows into full tool use
aie-17-agent-fundamentals — Typed output is the substrate for agent decisions
aie-32-error-handling-llm — Schema validation failures need graceful handling
ts-01-why-typescript — JSON Schema validation mirrors static type checking
plt-25-parser