thisago's blog


OpenAI with strict JSON Schema outputs


SDKs for calling LLM services are indeed "just wrappers for HTTP requests", but they abstract away some of the complexity around serialization, which sweetens your code a little.

The Python OpenAI SDK, the example we'll use here, integrates with Pydantic so that users can define structured outputs as code: typed classes. This is helpful and convenient, but here we'll quickly see how to reproduce the exact same call from any HTTP client, including cURL.

And no worries: this applies to "OpenAI-compatible" APIs in general, so your local LLM server might honor the same contract.

Python OpenAI SDK

Boilerplate

To reduce the magic, let's set up everything from scratch. I'll be using uv as the virtual environment and package manager.

pyproject.toml

[project]
name = "openai-json-schema"
version = "0.0.1"
requires-python = ">=3.13"

Now set up the venv and add the dependencies:

uv venv --allow-existing
uv add openai pydantic

Code

from json import dumps

from openai import OpenAI
from openai.lib._pydantic import to_strict_json_schema
from openai.types.chat import (
    ChatCompletionSystemMessageParam,
    ChatCompletionUserMessageParam,
)
from pydantic import BaseModel, TypeAdapter


# Defining the Structured Output model
class ResponseModel(BaseModel):
    "This MUST start with 'Hey'." # Doc string is included as object description

    message: str


client = OpenAI()  # Reads the API key from the OPENAI_API_KEY environment variable

system_message: ChatCompletionSystemMessageParam = {
    "role": "system",
    "content": "Start message with 'Hello'.", # Contradicting the model description for test
                                              # Expected: Take model description as override.
}
user_message: ChatCompletionUserMessageParam = {"role": "user", "content": "Say hi"}

# Call SDK
completion = client.chat.completions.parse(
    model="gpt-4.1",
    messages=[system_message, user_message],
    temperature=0.6,
    response_format=ResponseModel,
)
# Use the high-in-sugar `parsed` pre-processed object
choice_message = completion.choices[0].message.parsed

# Completion result
print(choice_message.model_dump_json(indent=2))


# And inspecting how OpenAI builds the JSON Schema with its internal method
# because it makes the schema stricter.
print("// model")
print(dumps(to_strict_json_schema(TypeAdapter(ResponseModel)), ensure_ascii=False, indent=2))

Output:

{
  "message": "Hey! How can I help you today?"
}
// model
{
  "description": "This MUST start with 'Hey'.",
  "properties": {
    "message": {
      "title": "Message",
      "type": "string"
    }
  },
  "required": [
    "message"
  ],
  "title": "ResponseModel",
  "type": "object",
  "additionalProperties": false
}

Easy, and now we know: the Structured Output feature is as simple as describing the output format in JSON Schema.
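To get a feeling for what "stricter" means here, below is a minimal sketch that uses only the standard library. It is a rough approximation, not the SDK's actual implementation: it only covers the two strict-mode rules visible in the output above (every object forbids extra keys and lists all of its properties as required). The names `make_strict` and `_apply_strict`, as well as the sample schema, are hypothetical.

```python
from copy import deepcopy
from json import dumps


def make_strict(schema: dict) -> dict:
    """Return a copy of `schema` with strict-mode constraints on every object."""
    strict = deepcopy(schema)
    _apply_strict(strict)
    return strict


def _apply_strict(node: dict) -> None:
    if node.get("type") == "object":
        properties = node.get("properties", {})
        # Strict mode forbids unknown keys and expects every property to be required.
        node["additionalProperties"] = False
        node["required"] = list(properties)
        for child in properties.values():
            _apply_strict(child)
    # Arrays carry their element schema under "items".
    if isinstance(node.get("items"), dict):
        _apply_strict(node["items"])


# Hypothetical plain schema, similar to what Pydantic generates for ResponseModel.
plain_schema = {
    "title": "ResponseModel",
    "type": "object",
    "properties": {"message": {"title": "Message", "type": "string"}},
    "required": ["message"],
}

strict_schema = make_strict(plain_schema)
print(dumps(strict_schema, indent=2))
```

The real helper handles more cases (references, unions and so on), so treat this only as a way to read the schema above, not as a replacement for it.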

Raw call

Now we can reproduce this with ease. The schema is simply passed at .response_format.json_schema.schema:

curl -s "https://api.openai.com/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $token" \
  -d "$(jq -nc '{
  model: "gpt-4.1",
  messages: [
    {role: "system", content: "Start message with Hello."},
    {role: "user", content: "Say hi"}
  ],
  response_format: {
    type: "json_schema",
    json_schema: {
      name: "messageSchema",
      strict: true,
      schema: {
        type: "object",
        additionalProperties: false,
        required: ["message"],
        properties: {
          message: {
            type: "string",
            description: "This MUST start with Hey."
          }
        }
      }
    }
  },
  temperature: 0.6
}')" |
  jq '.id = "REDACTED" | .system_fingerprint = "REDACTED"' # Ignore this line :)
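The same request can also be sketched in Python without the SDK, using only the standard library. The helper names (`build_payload`, `build_request`, `send`) and the `BASE_URL` constant are made up for this illustration, and the assumption is that an OpenAI-compatible server exposes the same `/chat/completions` path under its own base URL.

```python
import json
import urllib.request

# Assumption: swap this for the base URL of any OpenAI-compatible server.
BASE_URL = "https://api.openai.com/v1"


def build_payload() -> dict:
    """Build the same JSON body that the cURL command sends."""
    return {
        "model": "gpt-4.1",
        "messages": [
            {"role": "system", "content": "Start message with Hello."},
            {"role": "user", "content": "Say hi"},
        ],
        "response_format": {
            "type": "json_schema",
            "json_schema": {
                "name": "messageSchema",
                "strict": True,
                "schema": {
                    "type": "object",
                    "additionalProperties": False,
                    "required": ["message"],
                    "properties": {
                        "message": {
                            "type": "string",
                            "description": "This MUST start with Hey.",
                        }
                    },
                },
            },
        },
        "temperature": 0.6,
    }


def build_request(payload: dict, token: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Wrap the payload in a POST request with the same headers as the cURL call."""
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Perform the HTTP call and decode the JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling `send(build_request(build_payload(), token))` with a valid token performs the actual request; nothing is sent until then.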

As we can see, the result is stringified in .choices[0].message.content. Assuming the response was saved in $openai_json, it can be decoded like this:

echo "$openai_json" | jq '.choices[0].message.content | fromjson'
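For completeness, here is the Python counterpart of that `fromjson` step, as a small sketch. The `sample_response` dictionary is a hypothetical, trimmed-down response body (a real one carries more fields), and `extract_structured_output` is a name made up for this example.

```python
import json

# Hypothetical, trimmed-down response body; note that `content` is a JSON string.
sample_response = {
    "choices": [
        {
            "message": {
                "role": "assistant",
                "content": '{"message": "Hey! How can I help you today?"}',
            }
        }
    ]
}


def extract_structured_output(response: dict) -> dict:
    """Decode the JSON document embedded as a string in the first choice."""
    content = response["choices"][0]["message"]["content"]
    return json.loads(content)


result = extract_structured_output(sample_response)
print(result["message"])
```

This is essentially the step the SDK's `parsed` attribute saves you from, plus the validation against the Pydantic model.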

End

And this is the end: you now know how to write raw HTTP requests to OpenAI-compatible LLM APIs.

See you soon.