Fix: Instructor Not Working — Validation Loops, Mode Mismatch, Streaming, and Anthropic / Gemini Issues

FixDevs

Quick Answer

How to fix Python Instructor errors — ValidationError loops, max_retries exhausted, mode=Mode.TOOLS vs JSON, partial streaming type errors, Anthropic and Gemini client patching, token usage tracking.

The Error

You call an Instructor-patched OpenAI client and the request keeps retrying until it dies:

import instructor
from openai import OpenAI
from pydantic import BaseModel

class User(BaseModel):
    name: str
    age: int

client = instructor.from_openai(OpenAI())

user = client.chat.completions.create(
    model="gpt-4o-mini",
    response_model=User,
    messages=[{"role": "user", "content": "Tell me about Alice"}],
    max_retries=3,
)
# instructor.exceptions.InstructorRetryException:
# 1 validation error for User
# age
#   Input should be a valid integer [type=int_type, input_value='unknown']

Or you switch to Anthropic and it complains about the mode:

ValueError: Mode TOOLS is not supported by Anthropic. Use Mode.ANTHROPIC_TOOLS or Mode.ANTHROPIC_JSON.

Or your partial streaming returns objects with None everywhere:

for partial in client.chat.completions.create_partial(...):
    print(partial)
# User(name=None, age=None)
# User(name=None, age=None)

Or you upgraded openai and now nothing patches:

AttributeError: 'OpenAI' object has no attribute 'chat'

Why This Happens

Instructor wraps an LLM client and re-prompts the model when its output fails Pydantic validation. The wrapping happens in three layers, and most failures map to one of them:

  • Mode mismatch: Instructor uses a different prompting strategy depending on what the underlying API supports. OpenAI defaults to Mode.TOOLS (function calling), Anthropic needs Mode.ANTHROPIC_TOOLS, and Gemini needs Mode.GEMINI_JSON. Passing the wrong mode for the provider produces either a hard error or silently malformed output.
  • Validation loop exhaustion: If your response_model is too strict (a required field the model can’t infer) or your prompt doesn’t ask for the data, every retry fails and you hit max_retries. The library does the right thing here: it surfaces the last validation error so you can fix the prompt or the model.
  • Client version skew: instructor.from_openai() and instructor.from_anthropic() expect specific client API shapes. Mixing an old openai<1.0 client with new Instructor breaks the patching.

The streaming None issue is by design but trips everyone the first time: partials yield as fields arrive, so early partials genuinely don’t have most fields set yet.
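The retry behaviour described above can be pictured as a plain loop. The sketch below is not Instructor's implementation: validate_user, extract_with_retries, and make_fake_model are hypothetical names, the model is a stub that returns canned replies so the example runs offline, and max_retries is treated as the total number of attempts for simplicity.

```python
import json


def validate_user(raw: str) -> dict:
    """Parse the reply and enforce the schema: name is a str, age is an int."""
    data = json.loads(raw)
    if not isinstance(data.get("name"), str):
        raise ValueError("name must be a string")
    age = data.get("age")
    # bool is a subclass of int, so exclude it explicitly
    if not isinstance(age, int) or isinstance(age, bool):
        raise ValueError(f"age must be an integer, got {age!r}")
    return data


def extract_with_retries(model, prompt: str, max_retries: int = 3) -> dict:
    """Call the model, validate the reply, and feed the error back on failure."""
    messages = [{"role": "user", "content": prompt}]
    last_error = None
    for _ in range(max_retries):
        reply = model(messages)
        try:
            return validate_user(reply)
        except ValueError as exc:  # json.JSONDecodeError is a ValueError too
            last_error = exc
            messages.append({"role": "assistant", "content": reply})
            messages.append(
                {"role": "user", "content": f"Fix this error and answer again: {exc}"}
            )
    raise RuntimeError(f"gave up after {max_retries} attempts: {last_error}")


def make_fake_model(replies):
    """Stub standing in for an LLM: returns canned replies in order."""
    remaining = iter(replies)
    return lambda messages: next(remaining)


model = make_fake_model(
    ['{"name": "Alice", "age": "unknown"}', '{"name": "Alice", "age": 30}']
)
result = extract_with_retries(model, "Tell me about Alice, who is 30")
print(result)
```

The point of the sketch is that the error text travels back to the model as a new message, so a model that never receives enough information in the prompt will fail on every pass no matter how many retries you allow.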

Fix 1: Pick the Right Mode for Your Provider

Each provider has its own set of supported modes. Use the provider-specific from_* constructor and pass a mode it supports instead of guessing:

import instructor
from openai import OpenAI
from anthropic import Anthropic
import google.generativeai as genai

# OpenAI (default mode is TOOLS — usually correct)
openai_client = instructor.from_openai(OpenAI())

# Anthropic — must specify a supported mode
anthropic_client = instructor.from_anthropic(
    Anthropic(),
    mode=instructor.Mode.ANTHROPIC_TOOLS,
)

# Gemini
genai_client = instructor.from_gemini(
    client=genai.GenerativeModel("gemini-1.5-flash"),
    mode=instructor.Mode.GEMINI_JSON,
)

The supported modes per provider, as of Instructor 1.x:

  • OpenAI: TOOLS (default), TOOLS_STRICT, JSON, MD_JSON, PARALLEL_TOOLS
  • Anthropic: ANTHROPIC_TOOLS, ANTHROPIC_JSON
  • Gemini: GEMINI_JSON, GEMINI_TOOLS
  • Cohere / Mistral / Groq / Ollama: each has its own — check the docs

Pro Tip: Use TOOLS_STRICT on OpenAI when you need structured output to exactly match the schema. It enables OpenAI’s strict schema mode and eliminates a whole class of validation retries — at the cost of slightly higher latency.
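If you pick modes from configuration, it can help to check the name before constructing a client. This is a minimal sketch with a hypothetical check_mode helper; the mode names are copied from the list above and are stored as plain strings so the example runs without any provider SDK installed.

```python
# Mode names copied from the list above; the table itself is a hypothetical helper.
SUPPORTED_MODES = {
    "openai": ["TOOLS", "TOOLS_STRICT", "JSON", "MD_JSON", "PARALLEL_TOOLS"],
    "anthropic": ["ANTHROPIC_TOOLS", "ANTHROPIC_JSON"],
    "gemini": ["GEMINI_JSON", "GEMINI_TOOLS"],
}


def check_mode(provider: str, mode_name: str) -> str:
    """Return mode_name if it is listed for the provider, else raise with the options."""
    try:
        allowed = SUPPORTED_MODES[provider]
    except KeyError:
        raise ValueError(f"unknown provider {provider!r}") from None
    if mode_name not in allowed:
        raise ValueError(
            f"{mode_name} is not listed for {provider}; choose one of {allowed}"
        )
    return mode_name


print(check_mode("anthropic", "ANTHROPIC_TOOLS"))
```

With Instructor installed, a checked name can be turned into the enum member with getattr(instructor.Mode, name) before passing it as mode=.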

Fix 2: Inspect the Last Validation Error, Don’t Just Raise max_retries

InstructorRetryException wraps the last attempt’s exception and keeps the last raw completion. Use it to see what the model actually returned:

try:
    user = client.chat.completions.create(
        model="gpt-4o-mini",
        response_model=User,
        messages=[...],
        max_retries=3,
    )
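except instructor.exceptions.InstructorRetryException as e:
    print("Attempts:", e.n_attempts)
    print("Last completion:", e.last_completion)
    # str(e) carries the text of the final validation error
    print("Validation error:", e)
    # e.messages is the conversation that was sent, including any re-ask turns
    print("Last message sent:", e.messages[-1])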

Most of the time you’ll see the model returning a string (“unknown”, “not specified”) for a required int field, or refusing the question entirely. The fix is usually one of:

  • Add Optional[int] to the field
  • Add a Field(description=...) hint so the model knows what you want
  • Make the prompt explicit (“Return age as a number, or null if unknown”)

from typing import Optional
from pydantic import BaseModel, Field

class User(BaseModel):
    name: str = Field(description="Full name of the person")
    age: Optional[int] = Field(default=None, description="Age in years, null if unknown")

Fix 3: Validate Computed Fields with @field_validator

Instructor re-prompts on Pydantic validation errors, so your validators become part of the LLM correction loop. This is the killer feature — use it:

from pydantic import BaseModel, field_validator

class Answer(BaseModel):
    answer: str
    citations: list[str]

    @field_validator("citations")
    @classmethod
    def must_have_citations(cls, v):
        if len(v) < 1:
            raise ValueError("Answer must include at least one citation URL.")
        return v

If the model returns an answer with no citations, Instructor automatically asks it again with the validator’s error message. After a few iterations you usually get a valid response without writing any orchestration code.

Common Mistake: Using raise ValueError("invalid") with no useful message. The error text is what the LLM sees on retry — write it as if you’re telling a junior dev what’s wrong. “Must include at least one URL starting with https://” beats “invalid.”
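You can check what the model would be told on a retry without making an API call. The sketch below assumes Pydantic v2 is installed; must_be_https_urls is a hypothetical validator name, and the failing payload is made up to stand in for a model reply.

```python
from pydantic import BaseModel, ValidationError, field_validator


class Answer(BaseModel):
    answer: str
    citations: list[str]

    @field_validator("citations")
    @classmethod
    def must_be_https_urls(cls, v: list[str]) -> list[str]:
        bad = [c for c in v if not c.startswith("https://")]
        if not v or bad:
            # This text is what ends up in the validation error
            raise ValueError(
                "Include at least one citation, and every citation must be a URL "
                f"starting with https://. Offending values: {bad}"
            )
        return v


# Simulate a model reply that fails validation; no LLM call is involved.
try:
    Answer.model_validate({"answer": "Paris", "citations": ["wikipedia"]})
except ValidationError as exc:
    feedback = exc.errors()[0]["msg"]
    print(feedback)
```

Reading the printed message yourself is a quick test of whether it says enough for the model to correct the output.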

Fix 4: Streaming Partials and Iterables

Partial streaming yields a model where each field arrives as the tokens stream in. Early yields have None for fields the model hasn’t produced yet — that’s expected, not a bug:

# create_partial wraps response_model in Partial for you, so no Partial import is needed here

stream = client.chat.completions.create_partial(
    model="gpt-4o-mini",
    response_model=User,
    messages=[...],
)

for partial in stream:
    # partial.name fills in first, partial.age later
    print(partial.model_dump())
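Because early snapshots are mostly None, consumer code usually wants to react only when a field actually gains a value. The following sketch uses a dataclass as a stand-in for the partial objects and a hypothetical field_updates helper; the snapshots are invented for illustration, and whether string fields arrive whole or grow across snapshots may depend on the library version.

```python
from dataclasses import asdict, dataclass
from typing import Iterable, Iterator, Optional


@dataclass
class PartialUser:
    """Stand-in for a streamed partial: every field may still be None."""
    name: Optional[str] = None
    age: Optional[int] = None


def field_updates(stream: Iterable[PartialUser]) -> Iterator[tuple]:
    """Yield (field, value) only when a field changes to a new non-None value."""
    seen: dict = {}
    for partial in stream:
        for field, value in asdict(partial).items():
            if value is not None and seen.get(field) != value:
                seen[field] = value
                yield field, value


# Simulated stream: early snapshots are empty, later ones fill in.
fake_stream = [
    PartialUser(),
    PartialUser(name="Al"),
    PartialUser(name="Alice"),
    PartialUser(name="Alice", age=30),
]
updates = list(field_updates(fake_stream))
print(updates)
```

With real partials you would call partial.model_dump() in place of asdict(partial).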

For a list of items where you want each item complete before yielding, use create_iterable:

from pydantic import BaseModel

class City(BaseModel):
    name: str
    country: str

cities = client.chat.completions.create_iterable(
    model="gpt-4o-mini",
    response_model=City,  # singular — Instructor handles the list shape
    messages=[{"role": "user", "content": "List 5 European capitals"}],
)

for city in cities:
    print(city.name, city.country)

Don’t pass list[City] as response_model for streaming — pass the element type and use create_iterable.

Fix 5: Async Clients

Use the async constructors and await the call:

import asyncio
import instructor
from openai import AsyncOpenAI

aclient = instructor.from_openai(AsyncOpenAI())

async def main():
    user = await aclient.chat.completions.create(
        model="gpt-4o-mini",
        response_model=User,
        messages=[...],
    )
    print(user)

asyncio.run(main())

For async iteration over partials or items:

async for partial in aclient.chat.completions.create_partial(...):
    print(partial)

Note: Don’t mix sync and async clients. instructor.from_openai(OpenAI()) returns a sync wrapper; await client.chat.completions.create(...) on that raises TypeError: object ... can't be used in 'await' expression.
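The usual reason to go async is to run several extractions at once. This sketch shows one way to bound concurrency with a semaphore; fake_extract is a stub standing in for await aclient.chat.completions.create(...), and extract_all and limit are hypothetical names.

```python
import asyncio


async def fake_extract(text: str) -> dict:
    """Stub standing in for an awaited Instructor call."""
    await asyncio.sleep(0.01)  # pretend network latency
    return {"name": text.split()[0]}


async def extract_all(texts: list[str], limit: int = 2) -> list[dict]:
    """Run extractions concurrently with at most `limit` in flight at once."""
    sem = asyncio.Semaphore(limit)

    async def one(text: str) -> dict:
        async with sem:
            return await fake_extract(text)

    # gather returns results in the same order as its inputs
    return await asyncio.gather(*(one(t) for t in texts))


results = asyncio.run(extract_all(["Alice is 30", "Bob is 41", "Carol is 25"]))
print(results)
```

Keeping the limit low is a simple guard against provider rate limits, since each in-flight call may also trigger its own retries.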

Fix 6: Track Token Usage Without Losing the Validated Object

When you call client.chat.completions.create(...), you get back the parsed Pydantic model — the raw response with usage info is gone. To get both, use create_with_completion:

user, completion = client.chat.completions.create_with_completion(
    model="gpt-4o-mini",
    response_model=User,
    messages=[...],
)

print(user.name)
print("Tokens used:", completion.usage.total_tokens)

This returns a tuple: your validated model plus the raw provider response (with usage, id, system_fingerprint, etc.).
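If you want totals across many calls, a small accumulator is enough. This is a sketch with a hypothetical UsageTotals helper; the SimpleNamespace objects stand in for the raw completion, and the field names prompt_tokens and completion_tokens follow OpenAI's usage object (Anthropic reports input_tokens and output_tokens instead, so the attribute names would need adjusting there).

```python
from dataclasses import dataclass
from types import SimpleNamespace


@dataclass
class UsageTotals:
    """Running token totals across calls (hypothetical helper)."""
    prompt_tokens: int = 0
    completion_tokens: int = 0
    calls: int = 0

    def add(self, completion) -> None:
        usage = getattr(completion, "usage", None)
        if usage is None:  # skip responses that carry no usage block
            return
        self.prompt_tokens += usage.prompt_tokens
        self.completion_tokens += usage.completion_tokens
        self.calls += 1

    @property
    def total_tokens(self) -> int:
        return self.prompt_tokens + self.completion_tokens


# Stubs standing in for the second element returned by create_with_completion.
fake_completions = [
    SimpleNamespace(usage=SimpleNamespace(prompt_tokens=120, completion_tokens=18)),
    SimpleNamespace(usage=SimpleNamespace(prompt_tokens=95, completion_tokens=22)),
    SimpleNamespace(usage=None),
]
totals = UsageTotals()
for completion in fake_completions:
    totals.add(completion)
print(totals.calls, totals.total_tokens)
```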

Fix 7: OpenAI Client Version

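Instructor 1.x requires openai>=1.0. If you’re seeing 'OpenAI' object has no attribute 'chat' or similar attribute errors after the from_openai patch, you’re probably on an old or mismatched SDK. On openai<1.0 the import itself usually fails first, with ImportError: cannot import name 'OpenAI'. Upgrade both packages: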

pip install -U "openai>=1.40" "instructor>=1.4"

Pin them together in pyproject.toml so a future major release of either package doesn’t break your patching:

[project]
dependencies = [
    "instructor>=1.4,<2.0",
    "openai>=1.40,<2.0",
]

If you can’t upgrade the OpenAI SDK, pin to an older Instructor release whose requirements still allow openai<1.0 (check that release’s dependency metadata), but you really should upgrade.
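Before changing pins, it helps to confirm what is actually installed in the active environment. The sketch below uses only the standard library; installed_version and major are hypothetical helper names, and the minimum major versions are taken from the requirements stated in this article.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def major(version_string: str) -> int:
    """Leading numeric component, e.g. '1.40.2' gives 1."""
    return int(version_string.split(".")[0])


for pkg, minimum_major in [("openai", 1), ("instructor", 1), ("pydantic", 2)]:
    found = installed_version(pkg)
    if found is None:
        print(f"{pkg}: not installed")
    elif major(found) < minimum_major:
        print(f"{pkg} {found}: too old, need major version >= {minimum_major}")
    else:
        print(f"{pkg} {found}: ok")
```

Run it with the same interpreter your application uses; a mismatch between a virtualenv and the system Python is a common source of "I upgraded but nothing changed".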

Fix 8: Pydantic v1 Models

Instructor 1.x requires Pydantic v2. If you still have v1 models lying around:

from pydantic import BaseModel  # v2

# Won't work:
# from pydantic.v1 import BaseModel

Common v1→v2 changes that bite Instructor users:

  • Config class → model_config dict
  • validator → field_validator (with @classmethod)
  • parse_obj → model_validate
  • .dict() → .model_dump()
  • Field(..., regex=...) → Field(..., pattern=...)

Run bump-pydantic on your codebase to migrate the obvious cases automatically.
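The changes listed above look like this when applied to a single model. It is a minimal sketch assuming Pydantic v2; Ticket and its fields are hypothetical, and each comment names the v1 spelling that the line replaces.

```python
from pydantic import BaseModel, ConfigDict, Field, field_validator


class Ticket(BaseModel):
    # v1: class Config: extra = "forbid"
    model_config = ConfigDict(extra="forbid")

    # v1: Field(..., regex=...)
    code: str = Field(..., pattern=r"^[A-Z]{3}-\d+$")
    title: str

    # v1: @validator("title")
    @field_validator("title")
    @classmethod
    def title_not_blank(cls, v: str) -> str:
        if not v.strip():
            raise ValueError("title must not be blank")
        return v.strip()


# v1: Ticket.parse_obj(...) and ticket.dict()
ticket = Ticket.model_validate({"code": "ABC-12", "title": "  Fix login  "})
print(ticket.model_dump())
```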

Still Not Working?

A few less-common failures:

  • response_model=str doesn’t work. Older Instructor releases expect a Pydantic model; recent releases may accept simple types directly. Wrapping primitives works everywhere: class Result(BaseModel): value: str.
  • Anthropic returns <thinking> blocks in your strings. Set the system prompt to forbid them, or use Mode.ANTHROPIC_JSON which is stricter about output format.
  • max_retries doesn’t seem to apply. Pass it explicitly on the call (max_retries=3) rather than relying on defaults, which have changed between versions. For backoff, pass a tenacity.Retrying instance instead of an int.
  • Cost spikes after enabling retries. Each retry is a full chat completion. Cap with max_retries=3 and prefer schema fixes over higher retry counts.
  • InstructorRetryException on a valid-looking response. Print e.last_completion.choices[0].message — the model may be wrapping JSON in markdown fences. Switch to Mode.MD_JSON or Mode.TOOLS_STRICT.
  • Optional[Foo] field becomes a string "None". The model is hallucinating the literal string. Tighten the field description: “Return null if unknown — not the string ‘None’.”
  • Local Ollama / vLLM gives empty objects. Smaller open models often can’t follow tool-use schemas reliably. Use Mode.JSON or Mode.MD_JSON with a stricter prompt, and validate aggressively.
  • from_anthropic raises BadRequestError: tool_use_id. You’re sending an Anthropic message that mixes old/new tool formats. Reset the conversation or use Instructor’s helpers instead of building messages by hand.
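For the markdown-fence case in the list above, it can be useful to confirm by hand that the payload is valid JSON once the fence is removed. The sketch below uses a hypothetical strip_code_fence helper for debugging; it is not how Instructor's MD_JSON mode is implemented. The fence marker is built from single backticks so the example can be pasted anywhere.

```python
import json
import re

FENCE = "`" * 3  # three backticks, the markdown code-fence marker

# Opening fence with an optional language tag, the body, then a closing fence.
_FENCED = re.compile(
    r"^\s*" + FENCE + r"[\w-]*[ \t]*\n(.*?)\n?[ \t]*" + FENCE + r"\s*$",
    re.DOTALL,
)


def strip_code_fence(text: str) -> str:
    """Return the body of a fenced block, or the stripped text if unfenced."""
    match = _FENCED.match(text)
    return match.group(1) if match else text.strip()


raw = FENCE + 'json\n{"name": "Alice", "age": 30}\n' + FENCE
parsed = json.loads(strip_code_fence(raw))
print(parsed)
```

If the stripped payload parses and validates, the schema is fine and the fix is a mode change rather than a prompt change.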

For related Pydantic and LLM client issues, see Pydantic validation error, OpenAI API not working, LangChain Python not working, and Ollama not working.

FixDevs

Solo developer based in Japan. Every solution is cross-referenced with official documentation and tested before publishing.