Fix: Instructor Not Working — Validation Loops, Mode Mismatch, Streaming, and Anthropic / Gemini Issues
Quick Answer
How to fix Python Instructor errors — ValidationError loops, max_retries exhausted, mode=Mode.TOOLS vs JSON, partial streaming type errors, Anthropic and Gemini client patching, token usage tracking.
The Error
You call an Instructor-patched OpenAI client and the request keeps retrying until it dies:
import instructor
from openai import OpenAI
from pydantic import BaseModel

class User(BaseModel):
    name: str
    age: int

client = instructor.from_openai(OpenAI())
user = client.chat.completions.create(
    model="gpt-4o-mini",
    response_model=User,
    messages=[{"role": "user", "content": "Tell me about Alice"}],
    max_retries=3,
)
# instructor.exceptions.InstructorRetryException:
# 1 validation error for User
# age
#   Input should be a valid integer [type=int_type, input_value='unknown']

Or you switch to Anthropic and it complains about the mode:
ValueError: Mode TOOLS is not supported by Anthropic. Use Mode.ANTHROPIC_TOOLS or Mode.ANTHROPIC_JSON.

Or your partial streaming returns objects with None everywhere:

for partial in client.chat.completions.create_partial(...):
    print(partial)
# User(name=None, age=None)
# User(name=None, age=None)

Or you upgraded openai and now nothing patches:

AttributeError: 'OpenAI' object has no attribute 'chat'

Why This Happens
Instructor wraps an LLM client and re-prompts the model when its output fails Pydantic validation. The wrapping happens in three layers, and most failures map to one of them:
- Mode mismatch: Instructor uses different prompting strategies depending on what the underlying API supports. OpenAI defaults to Mode.TOOLS (function calling). Anthropic needs Mode.ANTHROPIC_TOOLS. Gemini needs Mode.GEMINI_JSON. Passing the wrong mode for the provider produces silent garbage or hard errors.
- Validation loop exhaustion: If your response_model is too strict (required field the model can’t infer) or your prompt doesn’t ask for the data, every retry fails and you hit max_retries. The library does the right thing: it surfaces the last validation error so you can fix the prompt or model.
- Client version skew: instructor.from_openai() and instructor.from_anthropic() expect specific client API shapes. Mixing an old openai<1.0 client with new Instructor breaks the patching.
The streaming None issue is by design but trips everyone the first time: partials yield as fields arrive, so early partials genuinely don’t have most fields set yet.
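To make the retry loop concrete, here is a minimal stdlib-only sketch of the idea. It is not Instructor's actual implementation: `validate_user`, `call_with_retries`, and `make_fake_llm` are hypothetical names, and the "model" is a canned list of replies so the example runs without an API key.

```python
import json

def validate_user(raw: str) -> dict:
    """Parse JSON and check that age is an int; raise ValueError otherwise."""
    data = json.loads(raw)
    if not isinstance(data.get("age"), int):
        raise ValueError(f"age: expected an integer, got {data.get('age')!r}")
    return data

def call_with_retries(fake_llm, prompt: str, max_retries: int = 3) -> dict:
    """Hypothetical loop: call, validate, and feed the error back on failure."""
    messages = [prompt]
    last_error = None
    for _ in range(max_retries):
        raw = fake_llm(messages)
        try:
            return validate_user(raw)
        except ValueError as exc:
            last_error = exc
            # The validation error text becomes part of the next prompt.
            messages.append(f"Previous output was invalid: {exc}. Try again.")
    raise RuntimeError(f"retries exhausted: {last_error}")

def make_fake_llm(replies):
    """Stand-in for a model: returns canned replies in order."""
    it = iter(replies)
    return lambda messages: next(it)

llm = make_fake_llm([
    '{"name": "Alice", "age": "unknown"}',  # first attempt fails validation
    '{"name": "Alice", "age": 30}',         # second attempt passes
])
result = call_with_retries(llm, "Tell me about Alice")
print(result)
```

If every reply fails validation, the loop ends with the last error attached, which mirrors the "retries exhausted" failure described above.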
Fix 1: Pick the Right Mode for Your Provider
Each provider has its own supported modes. Use the provider-specific from_* constructor and pass a mode that provider supports instead of guessing:
import instructor
from openai import OpenAI
from anthropic import Anthropic
import google.generativeai as genai

# OpenAI (default mode is TOOLS, usually correct)
openai_client = instructor.from_openai(OpenAI())

# Anthropic: must specify a supported mode
anthropic_client = instructor.from_anthropic(
    Anthropic(),
    mode=instructor.Mode.ANTHROPIC_TOOLS,
)

# Gemini
genai_client = instructor.from_gemini(
    client=genai.GenerativeModel("gemini-1.5-flash"),
    mode=instructor.Mode.GEMINI_JSON,
)

The supported modes per provider, as of Instructor 1.x:
- OpenAI: TOOLS (default), TOOLS_STRICT, JSON, MD_JSON, PARALLEL_TOOLS
- Anthropic: ANTHROPIC_TOOLS, ANTHROPIC_JSON
- Gemini: GEMINI_JSON, GEMINI_TOOLS
- Cohere / Mistral / Groq / Ollama: each has its own, so check the docs
Pro Tip: Use TOOLS_STRICT on OpenAI when you need structured output to exactly match the schema. It enables OpenAI’s strict schema mode and eliminates a whole class of validation retries — at the cost of slightly higher latency.
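If you want a cheap guard in your own code, the mode list above can be encoded as a lookup table. This is a sketch under assumptions: `SUPPORTED_MODES` and `check_mode` are hypothetical names, the table only mirrors the list in this article, and it uses plain strings rather than Instructor's own Mode enum.

```python
# Mode names copied from the per-provider list above; not read from Instructor itself.
SUPPORTED_MODES = {
    "openai": {"TOOLS", "TOOLS_STRICT", "JSON", "MD_JSON", "PARALLEL_TOOLS"},
    "anthropic": {"ANTHROPIC_TOOLS", "ANTHROPIC_JSON"},
    "gemini": {"GEMINI_JSON", "GEMINI_TOOLS"},
}

def check_mode(provider: str, mode_name: str) -> None:
    """Raise early if mode_name is not listed for provider (hypothetical helper)."""
    allowed = SUPPORTED_MODES.get(provider)
    if allowed is None:
        raise KeyError(f"no mode table for provider {provider!r}; check the docs")
    if mode_name not in allowed:
        raise ValueError(
            f"{mode_name} is not listed for {provider}; "
            f"expected one of {sorted(allowed)}"
        )

check_mode("anthropic", "ANTHROPIC_TOOLS")  # passes silently
```

Calling `check_mode("anthropic", "TOOLS")` raises a ValueError that names the modes you could have used, which is easier to read than a failure deep inside a request.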
Fix 2: Inspect the Last Validation Error, Don’t Just Raise max_retries
InstructorRetryException contains the last attempt’s exception. Use it to see what the model actually returned:
try:
    user = client.chat.completions.create(
        model="gpt-4o-mini",
        response_model=User,
        messages=[...],
        max_retries=3,
    )
except instructor.exceptions.InstructorRetryException as e:
    print("Attempts:", e.n_attempts)
    print("Last completion:", e.last_completion)
    print("Validation errors:", e.messages[-1])

Most of the time you’ll see the model returning a string (“unknown”, “not specified”) for a required int field, or refusing the question entirely. The fix is usually one of:
- Add Optional[int] to the field
- Add a Field(description=...) hint so the model knows what you want
- Make the prompt explicit (“Return age as a number, or null if unknown”)
from typing import Optional
from pydantic import BaseModel, Field

class User(BaseModel):
    name: str = Field(description="Full name of the person")
    age: Optional[int] = Field(default=None, description="Age in years, null if unknown")

Fix 3: Validate Computed Fields with @field_validator
Instructor re-prompts on Pydantic validation errors, so your validators become part of the LLM correction loop. This is the killer feature — use it:
from pydantic import BaseModel, field_validator

class Answer(BaseModel):
    answer: str
    citations: list[str]

    @field_validator("citations")
    @classmethod
    def must_have_citations(cls, v):
        if len(v) < 1:
            raise ValueError("Answer must include at least one citation URL.")
        return v

If the model returns an answer with no citations, Instructor automatically asks it again with the validator’s error message. After a few iterations you usually get a valid response without writing any orchestration code.
Common Mistake: Using raise ValueError("invalid") with no useful message. The error text is what the LLM sees on retry — write it as if you’re telling a junior dev what’s wrong. “Must include at least one URL starting with https://” beats “invalid.”
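You can preview the text a validator produces without calling any model, by validating a bad payload yourself. The sketch below assumes Pydantic v2; `validation_message` is a hypothetical helper, and the exact wording Instructor sends back to the model may wrap this text differently.

```python
from typing import Optional

from pydantic import BaseModel, ValidationError, field_validator

class Answer(BaseModel):
    answer: str
    citations: list[str]

    @field_validator("citations")
    @classmethod
    def must_have_citations(cls, v: list[str]) -> list[str]:
        # Specific message: says what is missing and what a valid value looks like.
        if not any(url.startswith("https://") for url in v):
            raise ValueError("Must include at least one URL starting with https://")
        return v

def validation_message(payload: dict) -> Optional[str]:
    """Return the validation error text for payload, or None if it is valid."""
    try:
        Answer.model_validate(payload)
    except ValidationError as exc:
        return str(exc)
    return None

msg = validation_message({"answer": "Paris", "citations": []})
print(msg)
```

Reading that printed message the way a junior dev would is a quick test of whether the model has enough to correct itself.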
Fix 4: Streaming Partials and Iterables
Partial streaming yields a model where each field arrives as the tokens stream in. Early yields have None for fields the model hasn’t produced yet — that’s expected, not a bug:
from instructor import Partial

stream = client.chat.completions.create_partial(
    model="gpt-4o-mini",
    response_model=User,
    messages=[...],
)
for partial in stream:
    # partial.name fills in first, partial.age later
    print(partial.model_dump())

For a list of items where you want each item complete before yielding, use create_iterable:
from typing import Iterable

class City(BaseModel):
    name: str
    country: str

cities = client.chat.completions.create_iterable(
    model="gpt-4o-mini",
    response_model=City,  # singular; Instructor handles the list shape
    messages=[{"role": "user", "content": "List 5 European capitals"}],
)
for city in cities:
    print(city.name, city.country)

Don’t pass list[City] as response_model for streaming; pass the element type and use create_iterable.
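Because early partials carry None for fields that haven't arrived, downstream code needs a guard before it acts on them. Here is a stdlib-only sketch of that guard: `PartialUser`, `is_complete`, and `first_complete` are hypothetical stand-ins, and the stream is a plain list rather than a real Instructor stream.

```python
from dataclasses import dataclass, fields
from typing import Iterable, Optional

@dataclass
class PartialUser:
    # Stand-in for a streamed partial: every field may still be None.
    name: Optional[str] = None
    age: Optional[int] = None

def is_complete(partial: PartialUser) -> bool:
    """True once every field has a non-None value."""
    return all(getattr(partial, f.name) is not None for f in fields(partial))

def first_complete(stream: Iterable[PartialUser]) -> Optional[PartialUser]:
    """Skip early partials and return the first fully populated one, if any."""
    for partial in stream:
        if is_complete(partial):
            return partial
    return None

fake_stream = [
    PartialUser(),
    PartialUser(name="Alice"),
    PartialUser(name="Alice", age=30),
]
done = first_complete(fake_stream)
print(done)
```

If your model has fields where None is a legitimate final value, this check never passes; in that case it is probably simpler to keep the last object the stream yields.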
Fix 5: Async Clients
Use the async constructors and await the call:
import asyncio
import instructor
from openai import AsyncOpenAI

aclient = instructor.from_openai(AsyncOpenAI())

async def main():
    user = await aclient.chat.completions.create(
        model="gpt-4o-mini",
        response_model=User,
        messages=[...],
    )
    print(user)

asyncio.run(main())

For async iteration over partials or items:

async for partial in aclient.chat.completions.create_partial(...):
    print(partial)

Note: Don’t mix sync and async clients. instructor.from_openai(OpenAI()) returns a sync wrapper; await client.chat.completions.create(...) on that raises TypeError: object ... can't be used in 'await' expression.
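The TypeError in that note is plain Python behavior, and you can reproduce it without any client. In this sketch `sync_create` and `async_create` are hypothetical stand-ins for a sync and an async client call.

```python
import asyncio

def sync_create():
    """Stand-in for a sync client call: returns a plain object, not a coroutine."""
    return {"name": "Alice", "age": 30}

async def async_create():
    """Stand-in for an async client call."""
    return {"name": "Alice", "age": 30}

async def main():
    outcomes = []
    try:
        await sync_create()  # wrong: the returned dict is not awaitable
    except TypeError as exc:
        outcomes.append(f"TypeError: {exc}")
    outcomes.append(await async_create())  # right: awaiting a coroutine
    return outcomes

outcomes = asyncio.run(main())
print(outcomes)
```

The exact TypeError wording varies between Python versions, so match on the exception type rather than the message if you handle it in code.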
Fix 6: Track Token Usage Without Losing the Validated Object
When you call client.chat.completions.create(...), you get back the parsed Pydantic model — the raw response with usage info is gone. To get both, use create_with_completion:
user, completion = client.chat.completions.create_with_completion(
    model="gpt-4o-mini",
    response_model=User,
    messages=[...],
)
print(user.name)
print("Tokens used:", completion.usage.total_tokens)

This returns a tuple: your validated model plus the raw provider response (with usage, id, system_fingerprint, etc.).
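Once you have the raw completion, adding up usage across calls is a few lines. The sketch below is an assumption-laden illustration: `UsageTracker` is a hypothetical class, the completions are faked with SimpleNamespace, and it only reads the OpenAI-style `usage.total_tokens` attribute shown above. Other providers report usage under different field names.

```python
from types import SimpleNamespace

class UsageTracker:
    """Accumulates total_tokens from any object shaped like completion.usage."""

    def __init__(self) -> None:
        self.total_tokens = 0
        self.calls = 0

    def record(self, completion) -> None:
        usage = getattr(completion, "usage", None)
        if usage is not None:
            self.total_tokens += usage.total_tokens
        self.calls += 1

# Fake completions standing in for the second element of the returned tuple.
fake_completions = [
    SimpleNamespace(usage=SimpleNamespace(total_tokens=n)) for n in (120, 95)
]
tracker = UsageTracker()
for completion in fake_completions:
    tracker.record(completion)
print(tracker.calls, tracker.total_tokens)
```

In real code you would call `tracker.record(completion)` right after each `create_with_completion` call.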
Fix 7: OpenAI Client Version
Instructor 1.x requires openai>=1.0. If you’re seeing 'OpenAI' object has no attribute 'chat' or similar attribute errors after the from_openai patch, you’re probably on the old SDK:
pip install -U "openai>=1.40" "instructor>=1.4"

Pin them together in pyproject.toml so a future openai major bump doesn’t break your patching:

[project]
dependencies = [
    "instructor>=1.4,<2.0",
    "openai>=1.40,<2.0",
]

If you can’t upgrade the OpenAI SDK, pin to an older Instructor that supports openai<1.0 (versions before 1.0), though upgrading is the better path.
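Before changing pins, it helps to confirm what is actually installed in the environment that fails. This stdlib sketch uses importlib.metadata; `installed_major` and `check_min_major` are hypothetical helpers, and the minimum majors passed in are taken from the requirements stated in this article.

```python
import re
from importlib import metadata
from typing import Optional

def installed_major(package: str) -> Optional[int]:
    """Return the installed major version of package, or None if unavailable."""
    try:
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        return None
    match = re.match(r"\d+", version)
    return int(match.group()) if match else None

def check_min_major(package: str, minimum: int) -> str:
    """Describe whether package meets the minimum major version."""
    major = installed_major(package)
    if major is None:
        return f"{package}: not installed"
    if major < minimum:
        return f"{package}: major version {major} is below {minimum}, upgrade"
    return f"{package}: ok (major version {major})"

for name, minimum in (("openai", 1), ("instructor", 1), ("pydantic", 2)):
    print(check_min_major(name, minimum))
```

Run it with the same interpreter your app uses; a mismatch between the shell's Python and the app's virtualenv is a common reason an upgrade appears not to take effect.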
Fix 8: Pydantic v1 Models
Instructor 1.x requires Pydantic v2. If you still have v1 models lying around:
from pydantic import BaseModel  # v2

# Won't work:
# from pydantic.v1 import BaseModel

Common v1→v2 changes that bite Instructor users:

- Config class → model_config dict
- validator → field_validator (with @classmethod)
- parse_obj → model_validate
- .dict() → .model_dump()
- Field(..., regex=...) → Field(..., pattern=...)
Run bump-pydantic on your codebase to migrate the obvious cases automatically.
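For reference, here is one small model that exercises each of the v2 spellings listed above. It is a sketch assuming Pydantic v2 is installed; the `Account` model and its fields are made up for illustration.

```python
from pydantic import BaseModel, ConfigDict, Field, field_validator

class Account(BaseModel):
    # v1: class Config: extra = "forbid"  ->  v2: model_config
    model_config = ConfigDict(extra="forbid")

    # v1: Field(..., regex=...)  ->  v2: Field(..., pattern=...)
    handle: str = Field(pattern=r"^[a-z0-9_]+$")
    age: int

    # v1: @validator("age")  ->  v2: @field_validator("age") with @classmethod
    @field_validator("age")
    @classmethod
    def age_must_be_plausible(cls, v: int) -> int:
        if not 0 <= v <= 150:
            raise ValueError("age must be between 0 and 150")
        return v

# v1: Account.parse_obj(...)  ->  v2: Account.model_validate(...)
account = Account.model_validate({"handle": "alice_1", "age": 30})

# v1: account.dict()  ->  v2: account.model_dump()
print(account.model_dump())
```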
Still Not Working?
A few less-common failures:
- response_model=str fails. Recent Instructor releases accept simple types, but older ones expect a Pydantic model. If a primitive fails, wrap it: class Result(BaseModel): value: str.
- Anthropic returns <thinking> blocks in your strings. Set the system prompt to forbid them, or use Mode.ANTHROPIC_JSON, which is stricter about output format.
- max_retries doesn’t seem to apply. Pass it explicitly on the call (max_retries=3) rather than relying on defaults, which have changed between versions. For backoff, pass a tenacity.Retrying instance instead of an int.
- Cost spikes after enabling retries. Each retry is a full chat completion. Cap with max_retries=3 and prefer schema fixes over higher retry counts.
- InstructorRetryException on a valid-looking response. Print e.last_completion.choices[0].message; the model may be wrapping JSON in markdown fences. Switch to Mode.MD_JSON or Mode.TOOLS_STRICT.
- Optional[Foo] field becomes a string "None". The model is hallucinating the literal string. Tighten the field description: “Return null if unknown, not the string ‘None’.”
- Local Ollama / vLLM gives empty objects. Smaller open models often can’t follow tool-use schemas reliably. Use Mode.JSON or Mode.MD_JSON with a stricter prompt, and validate aggressively.
- from_anthropic raises BadRequestError: tool_use_id. You’re sending an Anthropic message that mixes old/new tool formats. Reset the conversation or use Instructor’s helpers instead of building messages by hand.
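To check whether markdown fences are the reason a valid-looking response fails, you can unwrap the text yourself and see if it parses. This is a stdlib-only diagnostic sketch, not how Instructor's MD_JSON mode is implemented; `extract_json` is a hypothetical helper.

```python
import json
import re

FENCE = "`" * 3  # built programmatically so the example contains no literal fence

def extract_json(text: str) -> dict:
    """Parse JSON, unwrapping one surrounding markdown code fence if present."""
    pattern = re.compile(
        re.escape(FENCE) + r"(?:json)?\s*(.*?)\s*" + re.escape(FENCE),
        re.DOTALL,
    )
    match = pattern.search(text)
    payload = match.group(1) if match else text
    return json.loads(payload)

body = json.dumps({"name": "Alice", "age": 30})
wrapped = "Here you go:\n" + FENCE + "json\n" + body + "\n" + FENCE

print(extract_json(wrapped))
print(extract_json('{"name": "Bob", "age": null}'))
```

If the unwrapped text parses but the raw text doesn't, the model is fencing its output and a mode change is the likely fix.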
For related Pydantic and LLM client issues, see Pydantic validation error, OpenAI API not working, LangChain Python not working, and Ollama not working.
Solo developer based in Japan. Every solution is cross-referenced with official documentation and tested before publishing.