Fix: NATS Not Working — Connection Auth, JetStream Streams, Consumer Ack, and Subject Wildcards

FixDevs ·

Quick Answer

How to fix NATS errors — no responders to request, JetStream stream not found, consumer redelivery loop, durable vs ephemeral consumers, subject wildcard mismatch, TLS auth setup, and KV bucket basics.

The Error

You call nats.request and get this:

nats: no responders available for request

Or your JetStream publish fails because the stream doesn’t exist:

nats: stream not found: ORDERS

Or the consumer keeps redelivering the same message:

[consumer] received msg seq=1
[consumer] received msg seq=1   ← same one again, 30 seconds later
[consumer] received msg seq=1

Or a wildcard subscription matches nothing:

await nc.subscribe("orders.*.processed", cb=handle)
# Publishing to "orders.us.east.processed" → no delivery.

Why This Happens

NATS has two layers and most issues come from mixing them up:

  • Core NATS — fire-and-forget pub/sub and request/reply. No persistence. A request without a subscriber gets “no responders.” Subscribers come and go; nothing stored.
  • JetStream — persistent messaging built on top. Streams capture messages by subject pattern. Consumers (durable or ephemeral) read from streams with ack semantics.

Other common pitfalls:

  • Subject wildcards. * matches one token; > matches one or more. orders.* matches orders.a but not orders.a.b. orders.> matches both.
  • Ack timeouts. A consumer that doesn’t ack within AckWait (default 30s) gets the message redelivered. Long-running handlers must extend the deadline or ack early.
  • Durable vs ephemeral consumers. Durable consumers survive client restarts and resume from their last acked position. Ephemerals are tied to a connection and disappear with it.
  • TLS and auth. Bare nats://... is unencrypted and unauthenticated. Production needs TLS (tls://) for encryption plus authentication: user/password, a token, or NKeys/JWT.

Fix 1: “No Responders” Means No Subscriber

Request/reply needs at least one subscriber:

# Subscriber side:
import asyncio
import nats

async def main():
    nc = await nats.connect("nats://localhost:4222")
    
    async def handler(msg):
        await msg.respond(b"pong")
    
    await nc.subscribe("ping", cb=handler)
    await asyncio.sleep(3600)  # Stay alive

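asyncio.run(main())

# Requester side (separate process):
import asyncio
import nats
from nats.errors import NoRespondersError

async def main():
    nc = await nats.connect("nats://localhost:4222")
    try:
        response = await nc.request("ping", b"", timeout=1.0)
        print(response.data)
    except NoRespondersError:
        print("No subscriber on 'ping'")
    finally:
        await nc.close()

asyncio.run(main())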

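Check that both sides are connected to the same NATS server and use the same subject. From the nats CLI:

nats sub "ping"            # Watch requests arrive (a plain subscriber does not reply)
nats reply "ping" "pong"   # Stand in as a responder
nats req "ping" ""         # Send a request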

Pro Tip: For mission-critical request/reply patterns, never rely on a single responder being online. Use JetStream with consumer pulls so messages queue when no one is listening.

Fix 2: Create the JetStream Stream Before Publishing

JetStream needs explicit stream creation:

import nats
from nats.js.api import StreamConfig, RetentionPolicy

async def setup():
    nc = await nats.connect("nats://localhost:4222")
    js = nc.jetstream()
    
    await js.add_stream(
        name="ORDERS",
        subjects=["orders.>"],
        retention=RetentionPolicy.LIMITS,
        max_msgs=1_000_000,
        max_bytes=10 * 1024**3,   # 10 GB
        max_age=7 * 24 * 3600,    # 7 days, in seconds
    )

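Or via the CLI. Re-running the command with an identical configuration should succeed; to change an existing stream, use nats stream edit instead: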

nats stream add ORDERS \
    --subjects "orders.>" \
    --retention limits \
    --max-msgs 1000000 \
    --max-age 7d \
    --storage file

The > wildcard captures everything under orders., so messages published to orders.us.placed, orders.eu.shipped, and so on all land in this stream.

Common Mistake: Adding a stream with subjects orders (no wildcard) and wondering why orders.us.placed doesn’t get captured. Use orders.> for “everything under orders.”

Fix 3: Use Durable Consumers for Reliable Processing

Durable consumers survive restarts. Define one with explicit ack:

from nats.js.api import ConsumerConfig, AckPolicy, DeliverPolicy

await js.add_consumer(
    "ORDERS",
    config=ConsumerConfig(
        durable_name="ORDER_PROCESSOR",
        ack_policy=AckPolicy.EXPLICIT,
        deliver_policy=DeliverPolicy.ALL,
        max_deliver=5,
        ack_wait=60,  # seconds
    ),
)

Then pull or push:

# Pull-based (recommended for backpressure control):
sub = await js.pull_subscribe("orders.>", "ORDER_PROCESSOR", stream="ORDERS")

while True:
    try:
        msgs = await sub.fetch(batch=10, timeout=5)
        for msg in msgs:
            try:
                process(msg.data)
                await msg.ack()
            except TransientError:
                await msg.nak(delay=5)  # Retry in 5s
            except PermanentError:
                await msg.term()  # Don't retry
    except asyncio.TimeoutError:
        continue

Four ack outcomes:

  • ack(): successfully processed. Won’t be redelivered.
  • nak(delay=N): failed transiently. Redeliver after N seconds.
  • term(): failed permanently. Don’t redeliver. The server stops delivering the message to this consumer and publishes a MSG_TERMINATED advisory.
  • in_progress(): still working. Extends the ack deadline.

Pro Tip: For handlers that take longer than ack_wait, call msg.in_progress() periodically to extend the deadline. Otherwise NATS redelivers thinking the handler died.
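
The periodic in_progress() call can be wrapped in a small helper. What follows is a minimal sketch, not part of nats-py: run_with_heartbeat is a hypothetical name, and the only assumption about msg is that it exposes awaitable in_progress() and ack() methods, as JetStream messages in nats-py do.

```python
import asyncio
import contextlib
from typing import Awaitable, Callable


async def run_with_heartbeat(
    msg,
    work: Callable[[], Awaitable[None]],
    interval: float,
) -> None:
    """Run work() while extending the ack deadline every `interval` seconds.

    The message is acked only if work() finishes without raising.
    """

    async def heartbeat() -> None:
        while True:
            await asyncio.sleep(interval)
            await msg.in_progress()  # Tell the server we are still working

    beat = asyncio.create_task(heartbeat())
    try:
        await work()
    finally:
        # Stop the heartbeat whether work() succeeded or failed.
        beat.cancel()
        with contextlib.suppress(asyncio.CancelledError):
            await beat
    await msg.ack()
```

Pick an interval comfortably shorter than ack_wait (for example 20 seconds with ack_wait=60). If work() raises, the exception propagates and the message is left unacked, so the caller can still choose nak() or term().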

Fix 4: Subject Wildcards — * vs >

Two wildcard tokens:

  • * matches exactly one token. orders.*.placed matches orders.us.placed but not orders.us.east.placed.
  • > matches one or more tokens, only at the end. orders.> matches orders.us, orders.us.east, orders.us.east.placed, etc.

# Match all order events from any region (one-deep):
await nc.subscribe("orders.*.placed", cb=...)
# Matches: orders.us.placed, orders.eu.placed
# Doesn't match: orders.us.east.placed

# Match every order event:
await nc.subscribe("orders.>", cb=...)
# Matches all of the above.

For JetStream stream subjects, use the same wildcard rules. A stream with subjects=["orders.>"] captures everything under orders..

Common Mistake: Trying to use > in the middle: orders.>.placed is invalid. > must be the final token.
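
To check a pattern before deploying it, the matching rules above can be reproduced in a few lines. This is an illustrative sketch of the rules as described here, not the server's implementation; subject_matches is a hypothetical helper name.

```python
def subject_matches(pattern: str, subject: str) -> bool:
    """Return True if `subject` matches a NATS-style `pattern`.

    '*' matches exactly one token; '>' matches one or more tokens
    and is only honoured as the final token of the pattern.
    """
    p_tokens = pattern.split(".")
    s_tokens = subject.split(".")
    for i, p in enumerate(p_tokens):
        if p == ">":
            # Valid only in last position, and at least one token must remain.
            return i == len(p_tokens) - 1 and len(s_tokens) > i
        if i >= len(s_tokens):
            return False  # Subject ran out of tokens
        if p != "*" and p != s_tokens[i]:
            return False  # Literal token mismatch
    # No '>' seen: token counts must match exactly.
    return len(p_tokens) == len(s_tokens)


if __name__ == "__main__":
    for pattern, subject in [
        ("orders.*.placed", "orders.us.placed"),
        ("orders.*.placed", "orders.us.east.placed"),
        ("orders.>", "orders.us.east.placed"),
    ]:
        print(pattern, subject, subject_matches(pattern, subject))
```

Running the wildcard from the error section through it, subject_matches("orders.*.processed", "orders.us.east.processed") is False, which is why that subscription receives nothing.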

Fix 5: Connection Auth and TLS

Bare nats:// connections work for local dev but are unencrypted and unauthenticated. For production:

User/password:

nc = await nats.connect(
    "nats://user:pass@nats.example.com:4222",
)

Token:

nc = await nats.connect(
    "nats://nats.example.com:4222",
    token="s3cr3t",
)

NKey + JWT (recommended for prod):

nc = await nats.connect(
    "tls://nats.example.com:4222",
    user_credentials="./user.creds",
)

user.creds is the file you get from nsc generate creds. It contains both the NKey seed and the signed JWT.

TLS:

import ssl

ctx = ssl.create_default_context()
nc = await nats.connect(
    "tls://nats.example.com:4222",
    tls=ctx,
)

For mutual TLS (client cert auth):

ctx = ssl.create_default_context(ssl.Purpose.SERVER_AUTH, cafile="ca.pem")
ctx.load_cert_chain(certfile="client.pem", keyfile="client.key")
nc = await nats.connect("tls://nats.example.com:4222", tls=ctx)

Fix 6: Key-Value (KV) Buckets

JetStream KV is a key-value store built on streams. Common API:

js = nc.jetstream()

# Create / open a bucket:
kv = await js.create_key_value(bucket="config", history=5, ttl=3600)

# Set / get:
await kv.put("api.endpoint", b"https://api.example.com")
entry = await kv.get("api.endpoint")
print(entry.value)  # b"https://api.example.com"

# Watch for changes:
async for entry in await kv.watchall():
    print(entry.key, entry.value)

KV buckets are a thin layer over streams — under the hood, kv.put("key", value) is js.publish("$KV.config.key", value). Useful for:

  • Feature flags
  • Runtime config
  • Service discovery
  • Distributed locks (via CAS operations)

Note: KV is not a high-throughput cache. For per-request caching, use Redis. NATS KV shines for low-volume, consistency-sensitive state that multiple services need to watch.
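
Because a bucket is an ordinary stream, its traffic can be inspected with the usual stream and subject tools. The sketch below spells out the naming convention mentioned above; the helper names are hypothetical, and the KV_ stream prefix is the convention JetStream uses for KV buckets.

```python
def kv_stream_name(bucket: str) -> str:
    """Name of the JetStream stream that backs a KV bucket."""
    return f"KV_{bucket}"


def kv_subject(bucket: str, key: str) -> str:
    """Subject that a put to `key` in `bucket` is published on."""
    return f"$KV.{bucket}.{key}"


if __name__ == "__main__":
    print(kv_stream_name("config"))
    print(kv_subject("config", "api.endpoint"))
```

With these names, nats stream info KV_config shows the bucket's backing stream, and nats sub '$KV.config.>' tails every write to the bucket (single quotes stop the shell from expanding $KV).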

Fix 7: Reconnection and max_reconnect_attempts

NATS clients auto-reconnect by default. Tune:

nc = await nats.connect(
    servers=["nats://1.example.com:4222", "nats://2.example.com:4222"],
    max_reconnect_attempts=-1,  # Infinite
    reconnect_time_wait=2.0,    # 2s between attempts
    error_cb=on_error,
    closed_cb=on_closed,
    reconnected_cb=on_reconnected,
)

Pass multiple servers so the client can fail over. By default nats-py randomizes the order of the server list; pass dont_randomize=True to keep the order you listed.

Handle reconnection events:

async def on_disconnected():
    print("disconnected")

async def on_reconnected():
    print(f"reconnected to {nc.connected_url.netloc}")

nc = await nats.connect(
    "nats://...",
    disconnected_cb=on_disconnected,
    reconnected_cb=on_reconnected,
)

Common Mistake: Treating reconnects as errors. For long-lived clients, disconnect/reconnect is normal — handle it gracefully. Only closed_cb (terminal close, often after exhausting reconnects) is an actual failure.

Fix 8: Monitor With the NATS CLI

The nats CLI is invaluable for debugging:

# Server health:
nats server check connection

# Stream stats:
nats stream info ORDERS

# Consumer pending messages, redelivery counts:
nats consumer info ORDERS ORDER_PROCESSOR

# Tail a subject:
nats sub "orders.>"

# Publish a test message:
nats pub orders.us.placed '{"id":1}'

# Inspect KV bucket:
nats kv get config api.endpoint
nats kv ls config

For continuous monitoring, enable the server's HTTP monitoring port (for example nats-server -m 8222) and query the /varz, /connz, /jsz, and /healthz endpoints:

curl -s http://localhost:8222/varz | jq
curl -s "http://localhost:8222/jsz?accounts=true" | jq

The /jsz endpoint shows JetStream usage per account, stream sizes, consumer lag — feed it to Prometheus via the prometheus-nats-exporter.

Still Not Working?

A few less-obvious failures:

  • Messages published but never delivered. Subject mismatch. Use nats sub ">" from CLI to see everything hitting the server, then check the actual subject your publisher uses.
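  • Stream full and rejecting. max_msgs/max_bytes was hit on a stream whose discard policy is new; with the default discard policy (old), the oldest messages are dropped instead. Either purge old messages (nats stream purge ORDERS), raise the limits, or create the stream with work-queue retention (RetentionPolicy.WORK_QUEUE), which removes messages once they are acked.
  • Consumer lag grows unbounded. Processing is slower than ingestion. Add workers, batch with fetch(batch=N), or run several instances that pull from the same durable consumer so the server load-balances across them.
  • Random “context deadline exceeded” (Go clients) or nats.errors.TimeoutError (Python). The default request timeout is short. Pass timeout= explicitly for longer-running operations.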
  • JetStream API works for one account but not another. Each NATS account has its own JetStream. Check the account context (nsc list keys, nats context).
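  • max_deliver exhausted, messages stop arriving. The server stops delivering the message to that consumer and publishes an advisory on $JS.EVENT.ADVISORY.CONSUMER.MAX_DELIVERIES.<stream>.<consumer>. Under limits retention the message itself stays in the stream. Subscribe to that advisory subject to build a dead-letter queue.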
  • Server fills disk. JetStream is writing to disk. Check stream sizes and either reduce retention or add storage.
  • Cross-region replication needed. Use NATS mirror or source streams, or a leaf node topology. These need explicit config — they’re not automatic.
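
For the max_deliver case in the list above, a dead-letter handler usually starts by decoding the server's max-deliveries advisory. The sketch below assumes the advisory body is JSON with stream, consumer, stream_seq, and deliveries fields; check a real advisory from your server before relying on that. ExhaustedMessage, advisory_subject, and parse_max_deliveries are hypothetical names.

```python
import json
from typing import NamedTuple

ADVISORY_PREFIX = "$JS.EVENT.ADVISORY.CONSUMER.MAX_DELIVERIES"


class ExhaustedMessage(NamedTuple):
    stream: str
    consumer: str
    stream_seq: int   # Sequence of the original message in the stream
    deliveries: int   # How many times delivery was attempted


def advisory_subject(stream: str, consumer: str) -> str:
    """Subject to subscribe to for one consumer's max-deliveries advisories."""
    return f"{ADVISORY_PREFIX}.{stream}.{consumer}"


def parse_max_deliveries(payload: bytes) -> ExhaustedMessage:
    """Decode an advisory payload (assumed field names, see lead-in)."""
    body = json.loads(payload)
    return ExhaustedMessage(
        stream=body["stream"],
        consumer=body["consumer"],
        stream_seq=int(body["stream_seq"]),
        deliveries=int(body["deliveries"]),
    )
```

A handler would subscribe with nc.subscribe(advisory_subject("ORDERS", "ORDER_PROCESSOR"), cb=...), call parse_max_deliveries(msg.data), and then load the original message by sequence (in nats-py, js.get_msg("ORDERS", seq)) to copy it to a dead-letter stream.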

For related messaging and queue issues, see RabbitMQ connection refused, Kafka not working, Redis pub sub not working, and Celery task not received.

FixDevs

Solo developer based in Japan. Every solution is cross-referenced with official documentation and tested before publishing.
