Fix: msgspec Not Working — Struct Definition, Type Validation, and JSON/MessagePack Encoding
Quick Answer
How to fix msgspec errors — Struct field type not supported, ValidationError on decode, msgspec vs Pydantic differences, custom type hooks, frozen Struct mutation, and JSON Schema generation.
The Error
You define a Struct and decoding fails with a type error:
import msgspec

class User(msgspec.Struct):
    name: str
    age: int

data = b'{"name": "Alice", "age": "30"}'  # age is a string
user = msgspec.json.decode(data, type=User)
# msgspec.ValidationError: Expected `int`, got `str` - at `$.age`

Or you try to encode a custom type:
from datetime import datetime
from uuid import UUID, uuid4

import msgspec

class Event(msgspec.Struct):
    id: UUID
    timestamp: datetime
    data: dict

event = Event(id=uuid4(), timestamp=datetime.now(), data={"foo": "bar"})
encoded = msgspec.json.encode(event)
# Works for datetime/UUID, but custom types fail unless you supply an enc_hook

Or you mutate a frozen Struct:
class Config(msgspec.Struct, frozen=True):
    debug: bool = False

config = Config(debug=True)
config.debug = False  # AttributeError: frozen Structs reject attribute assignment

Or you want JSON Schema and it’s not where Pydantic puts it:

schema = User.model_json_schema()  # AttributeError: that's Pydantic's API, not msgspec's

Or msgspec output doesn’t match Pydantic’s:
# Pydantic: model_dump() returns a dict
pydantic_user.model_dump()  # {"name": "Alice", "age": 30}
# msgspec: the Struct itself is the data; you encode it explicitly
msgspec.json.encode(msgspec_user)  # bytes

msgspec is a serialization and validation library implemented as a C extension. It supports JSON and MessagePack, validates types strictly, and is usually much faster than Pydantic for JSON parsing; speedups of 10x or more are commonly reported, though the exact figure depends on the Pydantic version and the workload. Its design favors performance over Pydantic’s flexibility, and the API takes some getting used to. This guide covers the common failure modes.
Why This Happens
msgspec uses class-level type annotations (like Pydantic and dataclasses) to define schemas, and uses them to drive optimized native encode/decode routines. Validation is strict by default: "30" is not coerced to 30 unless you opt in with strict=False or handle the conversion yourself.
Unlike Pydantic, msgspec doesn’t bundle ORM integration or settings management. It focuses on one thing, fast serialization with type validation (plus JSON Schema generation through msgspec.json.schema), and does it exceptionally well.
Fix 1: Defining Structs
import msgspec

class User(msgspec.Struct):
    name: str
    age: int = 0  # Default
    email: str | None = None  # Optional via None default

Encode to JSON:
user = User(name="Alice", age=30, email="alice@example.com")
data = msgspec.json.encode(user)
# b'{"name":"Alice","age":30,"email":"alice@example.com"}'

# Pretty-print (slower, debugging only)
import json
pretty = json.dumps(json.loads(data), indent=2)

Decode from JSON:
data = b'{"name": "Alice", "age": 30}'
user = msgspec.json.decode(data, type=User)
print(user)  # User(name='Alice', age=30, email=None)

MessagePack (binary, usually smaller and faster to process than JSON):

data = msgspec.msgpack.encode(user)
# Binary representation, typically more compact than the JSON form
user = msgspec.msgpack.decode(data, type=User)

Struct options:
class User(msgspec.Struct, frozen=True, kw_only=True, rename="camel"):
    user_name: str
    user_age: int

| Option | Effect |
|---|---|
| frozen=True | Immutable after construction |
| kw_only=True | All fields keyword-only at construction |
| rename="camel" | Convert field names to camelCase in JSON |
| rename="kebab" | Convert to kebab-case |
| tag=True | Add discriminator tag for unions |
| gc=False | Skip garbage collection (faster, but careful with refs) |
| omit_defaults=True | Skip default values when encoding |
Common Mistake: Giving a field a default such as name: str = "" and still expecting msgspec to reject input that omits it. Any default makes the field optional: if the JSON omits name, msgspec silently uses the empty string. For a field that must be present in the input, omit the default.
Fix 2: Type Validation Strictness
msgspec is strict by default — no coercion between basic types:
class User(msgspec.Struct):
    age: int

msgspec.json.decode(b'{"age": 30}', type=User)    # OK
msgspec.json.decode(b'{"age": "30"}', type=User)  # ValidationError
msgspec.json.decode(b'{"age": 30.5}', type=User)  # ValidationError

For more lenient parsing (auto-coerce), use the strict=False option:

user = msgspec.json.decode(b'{"age": "30"}', type=User, strict=False)
# Now coerces "30" to 30

Or define accepted types as a Union:
class User(msgspec.Struct):
    age: int | str  # Accept either, you handle conversion

user = msgspec.json.decode(b'{"age": "30"}', type=User)
print(user.age)  # "30" (still a string, no coercion)

Pro Tip: Keep strict mode on by default. Lenient parsing (Pydantic’s default) silently converts values of the wrong type, which makes it easy for bad data to flow through your system unnoticed. Strict mode forces upstream code to send the right types and catches mistakes at the boundary.
Custom validation with __post_init__:
class Email(msgspec.Struct):
    address: str

    def __post_init__(self):
        if "@" not in self.address:
            raise ValueError(f"Invalid email: {self.address}")

Or use Annotated types for declarative constraints:

from typing import Annotated
import msgspec

PositiveInt = Annotated[int, msgspec.Meta(gt=0)]
ShortStr = Annotated[str, msgspec.Meta(max_length=100)]

class User(msgspec.Struct):
    age: PositiveInt
    name: ShortStr

Available Meta constraints:
| Constraint | Meaning |
|---|---|
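| gt, ge, lt, le | Numeric comparisons |
| min_length, max_length | Length of a str, bytes, or collection |
| pattern | Regex the string must match |
| tz | Whether a datetime must be timezone-aware (True) or naive (False) |
| multiple_of | Number must be a multiple of the given value |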
Fix 3: Custom Types and Encoding Hooks
msgspec handles many standard-library types automatically, including datetime, UUID, and Decimal:

import msgspec
from decimal import Decimal

class Order(msgspec.Struct):
    amount: Decimal  # msgspec handles Decimal natively

# Encode
order = Order(amount=Decimal("19.99"))
data = msgspec.json.encode(order)
# b'{"amount":"19.99"}'  (Decimal is serialized as a string to preserve precision)

# Decode
order = msgspec.json.decode(data, type=Order)
print(type(order.amount))  # <class 'decimal.Decimal'>

Custom hooks for unknown types:
import msgspec
from pathlib import Path

def enc_hook(obj):
    if isinstance(obj, Path):
        return str(obj)
    raise NotImplementedError(f"Cannot encode {type(obj)}")

def dec_hook(type, obj):
    if type is Path:
        return Path(obj)
    raise NotImplementedError(f"Cannot decode {type}")

class Config(msgspec.Struct):
    config_path: Path

config = Config(config_path=Path("/etc/app/config.yaml"))
data = msgspec.json.encode(config, enc_hook=enc_hook)
restored = msgspec.json.decode(data, type=Config, dec_hook=dec_hook)

For reusable hooks, create an Encoder/Decoder once:

encoder = msgspec.json.Encoder(enc_hook=enc_hook)
decoder = msgspec.json.Decoder(Config, dec_hook=dec_hook)
data = encoder.encode(config)
restored = decoder.decode(data)

Reusing encoders/decoders is faster than per-call encode()/decode() because they cache the schema introspection.
Common Mistake: Not raising NotImplementedError from hooks for unknown types. A hook that falls through without raising returns None, so msgspec encodes null in place of the value and the bad output goes unnoticed. Always raise for types the hook does not handle so the failure surfaces at the call site.
Fix 4: Unions and Discriminators
import msgspec

class Cat(msgspec.Struct, tag="cat"):
    name: str
    indoor: bool

class Dog(msgspec.Struct, tag="dog"):
    name: str
    breed: str

Pet = Cat | Dog
pet = msgspec.json.decode(b'{"type": "cat", "name": "Whiskers", "indoor": true}', type=Pet)
print(type(pet))  # <class '__main__.Cat'>

tag=True auto-generates a tag from the class name. tag="cat" uses an explicit string. The JSON must include "type": "cat" (or whatever the tag field is named) for msgspec to dispatch.
Custom tag field name:
class Cat(msgspec.Struct, tag="cat", tag_field="kind"):
    name: str

# Decode: {"kind": "cat", "name": "Whiskers"}

Untagged unions:

Pet = Cat | Dog  # Works because both classes have `tag` set
# Without tags, msgspec cannot tell the Struct types apart and rejects
# a union that contains more than one untagged Struct with a TypeError

Common Mistake: Forgetting to add tag= to every class in a union. msgspec does not try each Struct in turn; a union with more than one untagged Struct type is rejected as an unsupported type. Always add tag=True or an explicit tag to every Struct in a union.
Fix 5: Frozen and Hashable Structs
class Config(msgspec.Struct, frozen=True):
    debug: bool = False
    timeout: int = 30

config = Config(debug=True, timeout=60)
config.debug = False  # AttributeError: frozen

# But you can hash and compare
configs = {Config(debug=True), Config(debug=True), Config(debug=False)}
print(len(configs))  # 2 (deduplicated)

Replace fields immutably:

new_config = msgspec.structs.replace(config, timeout=120)
print(new_config)  # Config(debug=True, timeout=120), a new instance
print(config)      # Config(debug=True, timeout=60), unchanged

msgspec.structs.replace is the immutable update pattern, like dataclasses’ replace or attrs’ evolve.

Convert Struct to dict:
import msgspec

user = User(name="Alice", age=30)
data = msgspec.structs.asdict(user)
# {'name': 'Alice', 'age': 30}

# Or tuple
tup = msgspec.structs.astuple(user)
# ('Alice', 30)

These don’t trigger JSON encoding; they are pure Python conversions, useful for interop with dict-expecting code.
Fix 6: JSON Schema Generation
import msgspec
from typing import Annotated

class User(msgspec.Struct):
    name: Annotated[str, msgspec.Meta(min_length=1, max_length=100)]
    age: Annotated[int, msgspec.Meta(ge=0, le=150)]
    email: str | None = None

schema = msgspec.json.schema(User)
# The Struct's object schema sits under "$defs" and is referenced from the top level, roughly:
# {
#   "$ref": "#/$defs/User",
#   "$defs": {
#     "User": {
#       "title": "User",
#       "type": "object",
#       "properties": {
#         "name": {"type": "string", "minLength": 1, "maxLength": 100},
#         "age": {"type": "integer", "minimum": 0, "maximum": 150},
#         "email": {"anyOf": [{"type": "string"}, {"type": "null"}], "default": null}
#       },
#       "required": ["name", "age"]
#     }
#   }
# }

Components for OpenAPI:

schemas, components = msgspec.json.schema_components([User, Order, Item])
# schemas holds one schema (a $ref) per input type; components maps type names to their definitions

For FastAPI integration that uses msgspec for body validation, see FastAPI dependency injection error.
Fix 7: Performance Optimization
msgspec is fast by default, but a little tuning helps.
Reuse Encoder/Decoder objects for hot paths:

# Fine for occasional calls, but repeats per-call setup every time
def serialize(user):
    return msgspec.json.encode(user)

# Better on hot paths: one Encoder, created once and reused
encoder = msgspec.json.Encoder()

def serialize(user):
    return encoder.encode(user)

For decoders, reuse tends to matter more because a typed Decoder processes its type annotation once instead of on every call:

user_decoder = msgspec.json.Decoder(User)

def deserialize(data):
    return user_decoder.decode(data)

Disable garbage collection for short-lived data:
class Event(msgspec.Struct, gc=False):
    timestamp: float
    data: bytes

gc=False skips registering the instance with Python’s garbage collector, which makes construction faster. It is only safe when instances can never be part of a reference cycle.

MessagePack vs JSON:

# JSON: human-readable, web-standard, larger
json_data = msgspec.json.encode(user)
# MessagePack: binary, usually smaller and faster to encode and decode
msgpack_data = msgspec.msgpack.encode(user)

For internal services, MessagePack is usually the better choice. For browser-facing APIs, use JSON.
Pro Tip: Benchmark your actual workload. msgspec is among the fastest validation libraries for Python, and for high-throughput data pipelines (Kafka consumers, real-time analytics, ML inference servers) switching from Pydantic can cut validation latency and CPU usage substantially; figures such as 10x lower latency and 5x lower CPU depend on payload shape and Pydantic version. The cost is fewer features (no ORM integration, no settings management), but for pure validation, msgspec usually wins.
Fix 8: Migration from Pydantic
If you’re moving from Pydantic to msgspec for performance:
# Pydantic
from pydantic import BaseModel, Field

class User(BaseModel):
    name: str = Field(..., min_length=1)
    age: int = Field(..., ge=0)

user = User(name="Alice", age=30)
user.model_dump()  # {'name': 'Alice', 'age': 30}
data = user.model_dump_json()  # '{"name":"Alice","age":30}'

# msgspec equivalent
import msgspec
from typing import Annotated

class User(msgspec.Struct):
    name: Annotated[str, msgspec.Meta(min_length=1)]
    age: Annotated[int, msgspec.Meta(ge=0)]

user = User(name="Alice", age=30)
data_dict = msgspec.structs.asdict(user)  # Dict
data_bytes = msgspec.json.encode(user)  # JSON bytes

Key API differences:
| Pydantic | msgspec |
|---|---|
| class X(BaseModel) | class X(msgspec.Struct) |
| Field(...) | Annotated[T, msgspec.Meta(...)] |
| model.model_dump() | msgspec.structs.asdict(model) |
| model.model_dump_json() | msgspec.json.encode(model) |
| X.model_validate(data) | msgspec.convert(data, X) |
| X.model_validate_json(s) | msgspec.json.decode(s, type=X) |
| X.model_json_schema() | msgspec.json.schema(X) |
| @field_validator | __post_init__ or constraints |
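Features Pydantic has that msgspec doesn’t:
- ORM mode as a model-level setting (from_attributes); msgspec only offers a per-call from_attributes flag on msgspec.convert
- Settings management (use pydantic-settings)
- Dedicated validator decorators for complex logic across fields (msgspec has only __post_init__)
- Discriminated unions with custom logic
- Plugin ecosystem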
msgspec works best for: hot serialization paths, data ingestion pipelines, microservice payload parsing. Pydantic works best for: API surface validation, settings, complex business validation rules.
For Pydantic-specific patterns and comparing the two, see Pydantic validation error and Pydantic Settings not working.
Still Not Working?
msgspec with FastAPI
FastAPI uses Pydantic for request/response validation by default. To use msgspec:
from fastapi import FastAPI, Request, Response
import msgspec

app = FastAPI()
user_decoder = msgspec.json.Decoder(User)

@app.post("/users")
async def create_user(request: Request):
    body = await request.body()
    user = user_decoder.decode(body)
    # Process user
    # Wrap the bytes in a Response so FastAPI sends them as-is instead of re-encoding them
    payload = msgspec.json.encode({"created": user.name})
    return Response(content=payload, media_type="application/json")

This bypasses FastAPI’s automatic validation but gives you msgspec’s speed. For maximum integration, the Litestar web framework uses msgspec natively.
Custom JSON Types
class APIResponse(msgspec.Struct):
    data: dict  # Accepts any dict; no schema enforcement on contents
    metadata: msgspec.Raw  # Left undecoded

# The {...} placeholders stand for real JSON objects
response = msgspec.json.decode(b'{"data": {...}, "metadata": {...}}', type=APIResponse)
# response.metadata is a msgspec.Raw wrapping the undecoded bytes; parse separately if needed
print(bytes(response.metadata))  # b'{"...": "..."}'

msgspec.Raw is useful for passthrough: let the consumer decide how to parse the inner JSON.
Testing with msgspec
import pytest
import msgspec

@pytest.fixture
def user_decoder():
    return msgspec.json.Decoder(User)

def test_decode_valid(user_decoder):
    data = b'{"name": "Alice", "age": 30}'
    user = user_decoder.decode(data)
    assert user.name == "Alice"

def test_decode_invalid(user_decoder):
    data = b'{"name": "Alice", "age": "thirty"}'
    with pytest.raises(msgspec.ValidationError):
        user_decoder.decode(data)

For pytest fixture patterns with serialization, see pytest fixture not found.
Async Patterns
msgspec is synchronous (and fast enough that async wouldn’t add anything). In async code, just call directly:
async def handle_request(reader, writer):
    data = await reader.read()
    user = user_decoder.decode(data)
    # ...

For async code that does heavy serialization, msgspec is so fast that thread-pool offloading rarely helps.
Combining with attrs / dataclasses
msgspec Struct is similar to attrs but with built-in serialization. If you have existing attrs / dataclasses code:
# Convert attrs/dataclass instance to dict, then to msgspec Struct
import attrs
import msgspec

@attrs.define
class AttrsUser:
    name: str
    age: int

class MsgspecUser(msgspec.Struct):
    name: str
    age: int

attrs_user = AttrsUser(name="Alice", age=30)
data_dict = attrs.asdict(attrs_user)
msgspec_user = msgspec.convert(data_dict, MsgspecUser)

For attrs-specific patterns, see attrs not working.
Solo developer based in Japan. Every solution is cross-referenced with official documentation and tested before publishing.