Fix: NATS Not Working — Connection Auth, JetStream Streams, Consumer Ack, and Subject Wildcards
Quick Answer
How to fix NATS errors — no responders to request, JetStream stream not found, consumer redelivery loop, durable vs ephemeral consumers, subject wildcard mismatch, TLS auth setup, and KV bucket basics.
The Error
You call nats.request and get this:
nats: no responders available for request
Or your JetStream publish fails because the stream doesn’t exist:
nats: stream not found: ORDERS
Or the consumer keeps redelivering the same message:
[consumer] received msg seq=1
[consumer] received msg seq=1 ← same one again, 30 seconds later
[consumer] received msg seq=1
Or a wildcard subscription matches nothing:
await nc.subscribe("orders.*.processed", cb=handle)
# Publishing to "orders.us.east.processed" → no delivery.
Why This Happens
NATS has two layers and most issues come from mixing them up:
- Core NATS: fire-and-forget pub/sub and request/reply. No persistence. A request without a subscriber gets “no responders.” Subscribers come and go; nothing is stored.
- JetStream: persistent messaging built on top. Streams capture messages by subject pattern. Consumers (durable or ephemeral) read from streams with ack semantics.
Other common pitfalls:
- Subject wildcards. * matches exactly one token; > matches one or more trailing tokens. orders.* matches orders.a but not orders.a.b. orders.> matches both.
- Ack timeouts. A consumer that doesn’t ack within AckWait (default 30s) gets the message redelivered. Long-running handlers must extend the deadline or ack early.
- Durable vs ephemeral consumers. Durable consumers survive client restarts and resume from their last acked position. Ephemerals are tied to a connection and disappear with it.
- TLS and auth. Bare nats://... is unencrypted and unauthenticated. Production needs TLS (tls://) plus credentials: nats://user:pass@..., a token, or NKeys/JWT.
Fix 1: “No Responders” Means No Subscriber
Request/reply needs at least one subscriber:
# Subscriber side:
import asyncio
import nats

async def main():
    nc = await nats.connect("nats://localhost:4222")

    async def handler(msg):
        await msg.respond(b"pong")

    await nc.subscribe("ping", cb=handler)
    await asyncio.sleep(3600)  # Stay alive

asyncio.run(main())

# Requester side:
import asyncio
import nats
from nats.errors import NoRespondersError

async def main():
    nc = await nats.connect("nats://localhost:4222")
    try:
        response = await nc.request("ping", b"", timeout=1.0)
        print(response.data)
    except NoRespondersError:
        print("No subscriber on 'ping'")
    finally:
        await nc.close()

asyncio.run(main())
Check that both sides are connected to the same NATS server and use the same subject. From the NATS CLI:
nats sub "ping" # Listen
nats req "ping" "" # Send
Pro Tip: For mission-critical request/reply patterns, never rely on a single responder being online. Use JetStream with consumer pulls so messages queue when no one is listening.
Fix 2: Create the JetStream Stream Before Publishing
JetStream needs explicit stream creation:
import nats
from nats.js.api import RetentionPolicy

async def setup():
    nc = await nats.connect("nats://localhost:4222")
    js = nc.jetstream()
    await js.add_stream(
        name="ORDERS",
        subjects=["orders.>"],
        retention=RetentionPolicy.LIMITS,
        max_msgs=1_000_000,
        max_bytes=10 * 1024**3,  # 10 GB
        max_age=7 * 24 * 3600,  # 7 days, in seconds
    )
Or via the CLI (re-running with an identical configuration is safe):
nats stream add ORDERS \
--subjects "orders.>" \
--retention limits \
--max-msgs 1000000 \
--max-age 7d \
--storage file
The orders.> subject captures everything under orders., so messages published to orders.us.placed, orders.eu.shipped, etc. all land in this stream.
Common Mistake: Adding a stream with subjects orders (no wildcard) and wondering why orders.us.placed doesn’t get captured. Use orders.> for “everything under orders.”
Fix 3: Use Durable Consumers for Reliable Processing
Durable consumers survive restarts. Define one with explicit ack:
from nats.js.api import ConsumerConfig, AckPolicy, DeliverPolicy

# Inside an async function, with js = nc.jetstream():
await js.add_consumer(
    "ORDERS",
    config=ConsumerConfig(
        durable_name="ORDER_PROCESSOR",
        ack_policy=AckPolicy.EXPLICIT,
        deliver_policy=DeliverPolicy.ALL,
        max_deliver=5,
        ack_wait=60,  # seconds
    ),
)
Then pull or push:
# Pull-based (recommended for backpressure control):
# process, TransientError, and PermanentError are your application's own code.
sub = await js.pull_subscribe("orders.>", "ORDER_PROCESSOR", stream="ORDERS")
while True:
    try:
        msgs = await sub.fetch(batch=10, timeout=5)
        for msg in msgs:
            try:
                process(msg.data)
                await msg.ack()
            except TransientError:
                await msg.nak(delay=5)  # Retry in 5s
            except PermanentError:
                await msg.term()  # Don't retry
    except asyncio.TimeoutError:
        continue  # No messages within the timeout; poll again
Four ack outcomes:
- ack(): successfully processed. Won’t be redelivered.
- nak(delay=N): failed transiently. Redeliver after N seconds.
- term(): failed permanently. Don’t redeliver. The server stops redelivery immediately, regardless of max_deliver, and publishes a MSG_TERMINATED advisory.
- in_progress(): still working. Extends the ack deadline.
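A fixed nak delay retries a struggling dependency at a constant rate. The sketch below derives the delay from the delivery count instead, assuming the message exposes msg.metadata.num_delivered as nats-py JetStream messages do. The names nak_delay and nak_with_backoff, and the base and cap values, are illustrative choices rather than library API.

```python
def nak_delay(num_delivered: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Exponential backoff: base, 2*base, 4*base, ... capped at `cap` seconds."""
    attempt = max(num_delivered, 1)  # the first delivery counts as attempt 1
    return min(cap, base * 2 ** (attempt - 1))


async def nak_with_backoff(msg) -> float:
    """NAK `msg` with a delay that grows with its delivery count."""
    delay = nak_delay(msg.metadata.num_delivered)
    await msg.nak(delay=delay)
    return delay
```

In the pull loop, `await nak_with_backoff(msg)` would take the place of `await msg.nak(delay=5)`. Keep the cap comfortably inside whatever retry budget max_deliver gives you.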
Pro Tip: For handlers that take longer than ack_wait, call msg.in_progress() periodically to extend the deadline. Otherwise NATS redelivers thinking the handler died.
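One way to make the periodic in_progress() call hard to forget is to wrap the handler in a helper that runs a background ticker. This is a minimal sketch: process_with_keepalive and its interval parameter are hypothetical names, and msg only needs the in_progress() and ack() coroutines that nats-py JetStream messages provide.

```python
import asyncio
from contextlib import suppress


async def process_with_keepalive(msg, handler, interval: float = 10.0):
    """Run `handler(msg)` while calling msg.in_progress() every `interval` seconds."""

    async def keepalive():
        while True:
            await asyncio.sleep(interval)
            await msg.in_progress()  # tells the server the handler is still working

    ticker = asyncio.create_task(keepalive())
    try:
        result = await handler(msg)
        await msg.ack()  # ack only after the handler finished without raising
        return result
    finally:
        ticker.cancel()
        with suppress(asyncio.CancelledError):
            await ticker
```

Pick an interval well below ack_wait (for example a third of it) so one delayed tick does not trigger a redelivery. If the handler raises, the message is not acked and the exception propagates to the caller, which can then nak or term it.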
Fix 4: Subject Wildcards — * vs >
Two wildcard tokens:
- * matches exactly one token. orders.*.placed matches orders.us.placed but not orders.us.east.placed.
- > matches one or more tokens, only at the end. orders.> matches orders.us, orders.us.east, orders.us.east.placed, etc.
# Match all order events from any region (one-deep):
await nc.subscribe("orders.*.placed", cb=...)
# Matches: orders.us.placed, orders.eu.placed
# Doesn't match: orders.us.east.placed
# Match every order event:
await nc.subscribe("orders.>", cb=...)
# Matches all of the above.
For JetStream stream subjects, use the same wildcard rules. A stream with subjects=["orders.>"] captures everything under orders..
Common Mistake: Trying to use > in the middle: orders.>.placed is invalid. > must be the final token.
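When a subscription silently matches nothing, it can help to check the pattern offline. The function below is a small re-implementation of the two wildcard rules described above, written for illustration; subject_matches is not part of any NATS client, and it does not validate that the subject itself is well formed.

```python
def subject_matches(pattern: str, subject: str) -> bool:
    """Return True if `subject` matches a NATS-style `pattern`."""
    p_tokens = pattern.split(".")
    s_tokens = subject.split(".")
    for i, p in enumerate(p_tokens):
        if p == ">":
            # '>' must be the final token and must cover at least one token.
            return i == len(p_tokens) - 1 and len(s_tokens) > i
        if i >= len(s_tokens):
            return False  # the pattern has more tokens than the subject
        if p != "*" and p != s_tokens[i]:
            return False  # literal token mismatch
    return len(p_tokens) == len(s_tokens)
```

For the failing case from the top of this article, `subject_matches("orders.*.processed", "orders.us.east.processed")` returns False because * covers only one token, while `subject_matches("orders.>", "orders.us.east.processed")` returns True.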
Fix 5: Connection Auth and TLS
Bare nats:// connections work for local dev but are unencrypted and unauthenticated. For production:
User/password:
nc = await nats.connect(
    "nats://user:pass@nats.example.com:4222",
)
Token:
nc = await nats.connect(
    "nats://nats.example.com:4222",
    token="s3cr3t",
)
NKey + JWT (recommended for prod):
nc = await nats.connect(
    "tls://nats.example.com:4222",
    user_credentials="./user.creds",
)
user.creds is the file you get from nsc generate creds. It contains both the NKey seed and the signed JWT.
TLS:
import ssl
ctx = ssl.create_default_context()
nc = await nats.connect(
    "tls://nats.example.com:4222",
    tls=ctx,
)
For mutual TLS (client cert auth):
ctx = ssl.create_default_context(ssl.Purpose.SERVER_AUTH, cafile="ca.pem")
ctx.load_cert_chain(certfile="client.pem", keyfile="client.key")
nc = await nats.connect("tls://nats.example.com:4222", tls=ctx)
Fix 6: Key-Value (KV) Buckets
JetStream KV is a key-value store built on streams. Common API:
js = nc.jetstream()
# Create / open a bucket:
kv = await js.create_key_value(bucket="config", history=5, ttl=3600)
# Set / get:
await kv.put("api.endpoint", b"https://api.example.com")
entry = await kv.get("api.endpoint")
print(entry.value) # b"https://api.example.com"
# Watch for changes:
async for entry in await kv.watchall():
    print(entry.key, entry.value)
KV buckets are a thin layer over streams: under the hood, kv.put("key", value) is js.publish("$KV.config.key", value). Useful for:
- Feature flags
- Runtime config
- Service discovery
- Distributed locks (via CAS operations)
Note: KV is not a high-throughput cache. For per-request caching, use Redis. NATS KV shines for low-volume, consistency-sensitive state that multiple services need to watch.
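Because a bucket is just a stream, a key maps directly onto a subject, which is handy when you want to inspect KV traffic with the ordinary stream and subscription tools. The helper below sketches that mapping. kv_subject is a hypothetical name, and the key validation rule (letters, digits, and - / _ = . with no leading or trailing dot) is an assumption based on the documented key conventions, so check it against your client before relying on it.

```python
import re

# Assumed key rule: letters, digits, and - / _ = . (no leading or trailing dot).
_VALID_KEY = re.compile(r"[-/_=.a-zA-Z0-9]+")


def kv_subject(bucket: str, key: str) -> str:
    """Return the stream subject that backs `key` in `bucket`."""
    if not _VALID_KEY.fullmatch(key) or key.startswith(".") or key.endswith("."):
        raise ValueError(f"invalid KV key: {key!r}")
    return f"$KV.{bucket}.{key}"
```

With the bucket from the example, `kv_subject("config", "api.endpoint")` gives `$KV.config.api.endpoint`. The backing stream for that bucket is named KV_config, so `nats stream info KV_config` shows its size and limits.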
Fix 7: Reconnection and max_reconnect_attempts
NATS clients auto-reconnect by default. Tune:
nc = await nats.connect(
    servers=["nats://1.example.com:4222", "nats://2.example.com:4222"],
    max_reconnect_attempts=-1,  # Infinite
    reconnect_time_wait=2.0,  # 2s between attempts
    error_cb=on_error,  # async callbacks that you define
    closed_cb=on_closed,
    reconnected_cb=on_reconnected,
)
Pass multiple servers so the client can fail over. By default nats-py randomizes the order of the server list; pass dont_randomize=True to keep the listed order.
Handle reconnection events:
async def on_disconnected():
    print("disconnected")

async def on_reconnected():
    print(f"reconnected to {nc.connected_url.netloc}")

nc = await nats.connect(
    "nats://...",
    disconnected_cb=on_disconnected,
    reconnected_cb=on_reconnected,
)
Common Mistake: Treating reconnects as errors. For long-lived clients, disconnect/reconnect is normal, so handle it gracefully. Only closed_cb (terminal close, often after exhausting reconnects) signals an actual failure.
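To keep that distinction visible in code, the callbacks can feed a small state object that a health check reads. The sketch below is one possible shape: ConnectionMonitor and connect_monitored are hypothetical names, and only the callback keyword arguments (disconnected_cb, reconnected_cb, closed_cb) come from nats-py.

```python
class ConnectionMonitor:
    """Tracks connection state from async callbacks; only 'closed' is a failure."""

    def __init__(self) -> None:
        self.state = "connecting"
        self.reconnects = 0

    async def on_disconnected(self) -> None:
        self.state = "disconnected"  # expected; the client keeps retrying

    async def on_reconnected(self) -> None:
        self.state = "connected"
        self.reconnects += 1

    async def on_closed(self) -> None:
        self.state = "closed"  # terminal: the client has given up or was closed

    @property
    def healthy(self) -> bool:
        return self.state != "closed"


async def connect_monitored(url: str):
    """Connect with the monitor's callbacks attached; returns (nc, monitor)."""
    import nats  # imported lazily so ConnectionMonitor works without nats-py

    monitor = ConnectionMonitor()
    nc = await nats.connect(
        url,
        disconnected_cb=monitor.on_disconnected,
        reconnected_cb=monitor.on_reconnected,
        closed_cb=monitor.on_closed,
    )
    monitor.state = "connected"
    return nc, monitor
```

A liveness probe can then report `monitor.healthy`, which stays true through ordinary disconnects and turns false only after a terminal close.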
Fix 8: Monitor With the NATS CLI
The nats CLI is invaluable for debugging:
# Server health:
nats server check connection
# Stream stats:
nats stream info ORDERS
# Consumer pending messages, redelivery counts:
nats consumer info ORDERS ORDER_PROCESSOR
# Tail a subject:
nats sub "orders.>"
# Publish a test message:
nats pub orders.us.placed '{"id":1}'
# Inspect KV bucket:
nats kv get config api.endpoint
nats kv ls config
For continuous monitoring, enable the server’s HTTP monitoring port and poll the /varz, /connz, /jsz, and /healthz endpoints:
curl -s http://localhost:8222/varz | jq
curl -s "http://localhost:8222/jsz?accounts=true" | jq
The /jsz endpoint shows JetStream usage per account, stream sizes, and consumer lag. Feed it to Prometheus via the prometheus-nats-exporter.
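If you want a quick check without jq or Prometheus, the same endpoint can be polled from a script. This is a rough sketch that assumes the monitoring port is 8222, as in the curl examples, and that the response carries top-level streams, consumers, messages, bytes, and api.errors fields; summarize_jsz and fetch_jsz are illustrative names, and missing fields fall back to 0 rather than raising.

```python
import json
from urllib.request import urlopen


def summarize_jsz(payload: dict) -> dict:
    """Pick a few headline numbers out of a /jsz response body."""
    api = payload.get("api", {})
    return {
        "streams": payload.get("streams", 0),
        "consumers": payload.get("consumers", 0),
        "messages": payload.get("messages", 0),
        "bytes": payload.get("bytes", 0),
        "api_errors": api.get("errors", 0),
    }


def fetch_jsz(base_url: str = "http://localhost:8222") -> dict:
    """Fetch /jsz from the server's monitoring port and summarize it."""
    with urlopen(f"{base_url}/jsz", timeout=5) as resp:
        return summarize_jsz(json.load(resp))
```

Calling `fetch_jsz()` against a running server returns a small dict you can log or compare between runs; a steadily rising api_errors value is usually worth a closer look.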
Still Not Working?
A few less-obvious failures:
- Messages published but never delivered. Subject mismatch. Use nats sub ">" from the CLI to see everything hitting the server, then check the actual subject your publisher uses.
- Stream full and rejecting. max_msgs/max_bytes hit. Publishes are rejected only when the stream’s discard policy is new; with the default (old), the oldest messages are dropped instead. Either purge old messages (nats stream purge ORDERS), expand limits, or switch retention to WORK_QUEUE (auto-removes acked messages).
- Consumer lag grows unbounded. Processing is slower than ingestion. Add workers, batch with fetch(batch=N), or run several instances that pull from the same durable consumer so they share the load.
- Random “context deadline exceeded”. Default request timeout is short. Pass timeout= explicitly for longer-running operations.
- JetStream API works for one account but not another. Each NATS account has its own JetStream, which must be enabled for that account. Check the account context (nsc list keys, nats context).
- max_deliver exhausted, messages disappear. The consumer stops delivering them, but they stay in the stream. The server publishes an advisory on $JS.EVENT.ADVISORY.CONSUMER.MAX_DELIVERIES.<stream>.<consumer>; subscribe to it to build a dead-letter queue.
- Server fills disk. JetStream is writing to disk. Check stream sizes and either reduce retention or add storage.
- Cross-region replication needed. Use NATS mirror or source streams, or a leaf node topology. These need explicit config; they’re not automatic.
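For messages that exhaust max_deliver, one way to build a dead-letter path is to listen for the server's max-deliveries advisory and record which stream sequence gave up. The sketch below builds the advisory subject and parses the payload. It assumes the advisory JSON carries stream, consumer, stream_seq, and deliveries fields; both function names are illustrative, not client API.

```python
import json


def max_deliveries_subject(stream: str = "*", consumer: str = "*") -> str:
    """Advisory subject for messages that exhausted max_deliver."""
    return f"$JS.EVENT.ADVISORY.CONSUMER.MAX_DELIVERIES.{stream}.{consumer}"


def parse_max_deliveries(data: bytes) -> dict:
    """Extract the fields needed to locate the failed message in its stream."""
    advisory = json.loads(data)
    return {
        "stream": advisory["stream"],
        "consumer": advisory["consumer"],
        "stream_seq": advisory["stream_seq"],
        "deliveries": advisory.get("deliveries"),
    }
```

A plain core subscription such as `await nc.subscribe(max_deliveries_subject("ORDERS", "ORDER_PROCESSOR"), cb=on_advisory)` would receive these events, where on_advisory is your own callback. The advisory names the sequence only, so the handler still has to fetch the original message from the stream by stream_seq before copying it to a dead-letter stream.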
For related messaging and queue issues, see RabbitMQ connection refused, Kafka not working, Redis pub sub not working, and Celery task not received.
Solo developer based in Japan. Every solution is cross-referenced with official documentation and tested before publishing.