OpenAI with strict JSON Schema outputs
Indeed, SDKs for calling LLM services are "just wrappers for HTTP requests", but they abstract away some of the complexity around serialization, sweetening your code a little.
The Python OpenAI SDK, the example we'll use here, interfaces with Pydantic to let
users define structured outputs as code: typed classes. This is helpful and
convenient, but here we'll quickly see how to reproduce the exact call from any HTTP
client, including cURL.
And no worries: this applies to any "OpenAI-compatible" API, so your local LLM server might honor the same contract.
Python OpenAI SDK
Boilerplate
To reduce the magic, let's set up everything from scratch. I'll be using uv
as the virtual env and package manager.
pyproject.toml
[project]
name = "openai-json-schema"
version = "0.0.1"
requires-python = ">=3.13"
Now set up the venv and add the dependencies:
uv venv --allow-existing
uv add openai pydantic
Code
from json import dumps
from os import environ

from openai import OpenAI
from openai.lib._pydantic import to_strict_json_schema
from openai.types.chat import (
    ChatCompletionSystemMessageParam,
    ChatCompletionUserMessageParam,
)
from pydantic import BaseModel, TypeAdapter


# Defining the Structured Output model
class ResponseModel(BaseModel):
    "This MUST start with 'Hey'."  # Doc string is included as object description

    message: str


# Assumption: the API key is exported as the OPENAI_API_KEY environment variable
openai_api_key = environ["OPENAI_API_KEY"]
client = OpenAI(api_key=openai_api_key)

system_message: ChatCompletionSystemMessageParam = {
    "role": "system",
    "content": "Start message with 'Hello'.",
    # Contradicting the model description for test
    # Expected: Take model description as override.
}
user_message: ChatCompletionUserMessageParam = {"role": "user", "content": "Say hi"}

# Call SDK
completion = client.chat.completions.parse(
    model="gpt-4.1",
    messages=[system_message, user_message],
    temperature=0.6,
    response_format=ResponseModel,
)

# Use the high-in-sugar `parsed` pre-processed object
choice_message = completion.choices[0].message.parsed

# Completion result
print(choice_message.model_dump_json(indent=2))

# And inspecting how OpenAI builds the JSON Schema with its internal method
# because it makes the schema stricter.
print("// model")
print(dumps(to_strict_json_schema(TypeAdapter(ResponseModel)), ensure_ascii=False, indent=2))
{
  "message": "Hey! How can I help you today?"
}
// model
{
  "description": "This MUST start with 'Hey'.",
  "properties": {
    "message": {
      "title": "Message",
      "type": "string"
    }
  },
  "required": [
    "message"
  ],
  "title": "ResponseModel",
  "type": "object",
  "additionalProperties": false
}
Easy, and now we know: the Structured Output feature is as simple as describing the output format in JSON Schema.
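To make "stricter" a bit more concrete: as far as I understand OpenAI's strict mode, every object in the schema has to set additionalProperties to false and list all of its properties as required. The sketch below applies those two rules to a plain schema using only the standard library. It is a simplified illustration, not the SDK's to_strict_json_schema implementation, and make_strict is a hypothetical helper name.

```python
import copy
import json


def make_strict(schema: dict) -> dict:
    """Return a copy where every object schema forbids extra keys
    and lists all of its properties as required."""
    strict = copy.deepcopy(schema)  # leave the caller's schema untouched

    def visit(node: object) -> None:
        if isinstance(node, dict):
            if node.get("type") == "object" and "properties" in node:
                node["additionalProperties"] = False
                node["required"] = list(node["properties"])
            # Recurse into nested schemas (properties, items, and so on)
            for value in node.values():
                visit(value)
        elif isinstance(node, list):
            for item in node:
                visit(item)

    visit(strict)
    return strict


# A hand-written schema without the strictness keys
plain_schema = {
    "type": "object",
    "properties": {"message": {"type": "string"}},
}
strict_schema = make_strict(plain_schema)
print(json.dumps(strict_schema, indent=2))
```

The printed schema gains the same additionalProperties and required keys that show up in the SDK output above.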
Raw call
Now we can reproduce this with ease. The schema is simply passed at .response_format.json_schema.schema:
curl -s "https://api.openai.com/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $token" \
  -d "$(jq -nc '{
    model: "gpt-4.1",
    messages: [
      {role: "system", content: "Start message with Hello."},
      {role: "user", content: "Say hi"}
    ],
    response_format: {
      type: "json_schema",
      json_schema: {
        name: "messageSchema",
        strict: true,
        schema: {
          type: "object",
          additionalProperties: false,
          required: ["message"],
          properties: {
            message: {
              type: "string",
              description: "This MUST start with Hey."
            }
          }
        }
      }
    },
    temperature: 0.6
  }')" |
  jq '.id = "REDACTED" | .system_fingerprint = "REDACTED"' # Ignore this line :)
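The same request can also be assembled from Python without the SDK, which shows that nothing beyond a JSON body and two headers is involved. This is a minimal sketch using only the standard library; build_payload, build_request and send are hypothetical helper names, and the payload simply mirrors the cURL call above.

```python
import json
import urllib.request

API_URL = "https://api.openai.com/v1/chat/completions"


def build_payload() -> dict:
    """Same body as the cURL call: the schema sits at
    response_format.json_schema.schema."""
    return {
        "model": "gpt-4.1",
        "messages": [
            {"role": "system", "content": "Start message with Hello."},
            {"role": "user", "content": "Say hi"},
        ],
        "response_format": {
            "type": "json_schema",
            "json_schema": {
                "name": "messageSchema",
                "strict": True,
                "schema": {
                    "type": "object",
                    "additionalProperties": False,
                    "required": ["message"],
                    "properties": {
                        "message": {
                            "type": "string",
                            "description": "This MUST start with Hey.",
                        }
                    },
                },
            },
        },
        "temperature": 0.6,
    }


def build_request(token: str) -> urllib.request.Request:
    """Prepare the POST request; nothing is sent at this point."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(build_payload()).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def send(token: str) -> str:
    """Send the request and return the raw JSON response body."""
    with urllib.request.urlopen(build_request(token)) as response:
        return response.read().decode("utf-8")
```

Calling send with a valid API key performs the actual network call; the two build functions can be inspected offline.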
The API returns the structured result as a JSON string in .choices[0].message.content, so it has to be parsed once more:
echo "$openai_json" | jq '.choices[0].message.content | fromjson'
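The double decoding is easy to miss, so here is the same step sketched in Python with the standard library only. The sample body is a trimmed, hand-written stand-in for a real response (only the fields needed here), and extract_structured_content is a hypothetical helper name.

```python
import json


def extract_structured_content(response_body: str) -> dict:
    """Decode the HTTP body, then decode the stringified JSON inside
    choices[0].message.content."""
    response = json.loads(response_body)
    content = response["choices"][0]["message"]["content"]
    return json.loads(content)


# Hand-written stand-in for a response body; a real one carries more fields
sample_body = json.dumps(
    {
        "choices": [
            {
                "message": {
                    "role": "assistant",
                    "content": json.dumps({"message": "Hey! How can I help you today?"}),
                }
            }
        ]
    }
)

result = extract_structured_content(sample_body)
print(result["message"])
```

The first json.loads handles the response envelope and the second one handles the content string, which is what fromjson does in the jq version.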
End
And this is the end. You now know how to write raw HTTP requests to OpenAI-compatible LLM APIs.
See you soon.