Fix: Go Deadlock — all goroutines are asleep, deadlock!
Quick Answer
How to fix Go channel deadlocks: unbuffered vs buffered channels, missing goroutines, select statements, closing channels, sync primitives, and debugging stuck goroutines with stack dumps, timeouts, and the race detector.
The Problem
A Go program crashes with the deadlock error:
fatal error: all goroutines are asleep - deadlock!
goroutine 1 [chan receive]:
main.main()
/app/main.go:12 +0x28
exit status 2
Or a program hangs indefinitely without output:
func process(data []int) []int {
	ch := make(chan int)
	var wg sync.WaitGroup
	for _, v := range data {
		wg.Add(1)
		go func(n int) {
			ch <- n * 2 // Sends to channel, blocks until someone receives
			wg.Done()
		}(v)
	}
	wg.Wait() // Waits for every goroutine to finish, but who reads from ch?
	close(ch) // Never reached: the goroutines are still blocked on ch <- (send)
	var results []int
	for v := range ch { // Never reached: deadlocked above
		results = append(results, v)
	}
	return results
}
Or a goroutine blocks forever waiting for a channel that’s never written to:
result := <-ch // Blocks forever if nothing sends to ch
Why This Happens
A deadlock occurs when every goroutine in the program is blocked, each waiting on a channel or mutex operation that no other goroutine can satisfy. The Go runtime checks for this condition and aborts with a fatal error rather than hanging silently; that is the all goroutines are asleep message at the top of the stack trace. Unlike a panic, this fatal error cannot be caught with recover(). Detection only triggers when the entire program is stuck. Partial deadlocks, where at least one goroutine can still run (a background timer, a network listener), escape the runtime check and surface later as goroutine leaks and memory growth.
The first conceptual hurdle is the difference between buffered and unbuffered channels. An unbuffered channel is a rendezvous point: the sender blocks until a receiver arrives, and vice versa. A buffered channel of size N lets the sender succeed without a receiver up to N times, then blocks. Most deadlocks come from assuming a channel will buffer when it won’t, or from setting up senders before any receiver exists. The second hurdle is ownership: every channel needs a clear sender side that closes it. If both ends try to close, you get a panic on the second close. If neither end closes, for range loops never terminate.
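The buffering rule above can be checked with a small experiment. This is a minimal sketch, not part of the original article: trySend and countAccepted are hypothetical helper names, and the non-blocking select is used only so the program can count sends instead of hanging.

```go
package main

import "fmt"

// trySend attempts a send without blocking and reports whether it succeeded.
func trySend(ch chan int, v int) bool {
	select {
	case ch <- v:
		return true
	default:
		return false // a plain send would have blocked here
	}
}

// countAccepted reports how many of `attempts` sends a channel with the
// given capacity accepts while no receiver is running.
func countAccepted(capacity, attempts int) int {
	ch := make(chan int, capacity)
	accepted := 0
	for i := 0; i < attempts; i++ {
		if trySend(ch, i) {
			accepted++
		}
	}
	return accepted
}

func main() {
	fmt.Println(countAccepted(0, 5)) // 0: unbuffered, every send needs a receiver
	fmt.Println(countAccepted(3, 5)) // 3: the buffer absorbs three sends, then fills
}
```

With a plain `ch <- v` instead of the select, the first rejected send is exactly where the program would block.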
Common causes:
- Sending to an unbuffered channel with no receiver: unbuffered channels block the sender until a receiver is ready. If no goroutine is reading, the sender blocks forever.
- Reading from a channel that will never receive data: if the only writer exits without sending or closing, the reader blocks forever. (A receive on a closed channel returns the zero value immediately, so closing unblocks readers.)
- wg.Wait() before starting the reader: wg.Wait() blocks until all goroutines finish. If goroutines block on sending to a channel that nobody reads, wg.Wait() never returns.
- Circular channel dependency: goroutine A waits for goroutine B to send, goroutine B waits for goroutine A to send.
- Not closing a channel being ranged: for v := range ch blocks after the last item until the channel is closed.
- sync.Mutex locked twice: calling Lock() when you already hold the lock deadlocks, because Go mutexes are not reentrant. Restructure the locking so each path locks once; RWMutex allows read-sharing but is not reentrant either.
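The circular dependency case is the one cause in this list that the fixes below do not show directly, so here is a minimal sketch of one way to break the cycle. The function name exchange and the channel names are illustrative assumptions. Each side sends before it receives; with unbuffered channels both sends would block forever, while a capacity of 1 lets each send complete first.

```go
package main

import "fmt"

// exchange runs two sides that each send a value to the other and then
// receive the other's value. Capacity 1 on both channels means neither
// send has to wait for the opposite side to start receiving.
func exchange(a, b int) (gotByA, gotByB int) {
	toA := make(chan int, 1)
	toB := make(chan int, 1)
	done := make(chan struct{})

	go func() { // side B
		toA <- b
		gotByB = <-toB
		close(done)
	}()

	// side A (the calling goroutine)
	toB <- a
	gotByA = <-toA
	<-done // wait for B so gotByB is safely written before returning
	return gotByA, gotByB
}

func main() {
	fmt.Println(exchange(1, 2)) // 2 1
}
```

Another option is to keep the channels unbuffered and have one side receive first, so the two goroutines never wait on each other in the same state.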
In Production: Incident Lens
Channel deadlocks behave differently in production than in a small test program. In a small program, the runtime catches the all-goroutines-asleep state and aborts, so you see the failure immediately. In production, the deadlocked goroutines are a tiny fraction of the goroutines in the process. Health checks pass. HTTP handlers serve requests. The runtime never trips its detector because thousands of other goroutines are still running. Instead, leaked goroutines accumulate, memory grows, and the service is eventually killed for running out of memory.
Blast radius. The impact is “service memory grows until restart.” Each deadlocked goroutine holds its stack (a few kilobytes at minimum; the initial stack is about 2 KB in current Go releases) plus any captured variables. A handler that leaks one goroutine per request can accumulate gigabytes of leaked memory before the OOM killer fires. If the service is behind a load balancer with health checks that succeed on /health, the LB keeps sending traffic to the leaking instance until the kernel terminates the process.
Monitoring signal. Track go_goroutines from the Prometheus client library as a primary SLI. Healthy services hold a relatively stable goroutine count proportional to in-flight work. A monotonically rising goroutine count is a leak — almost always a deadlocked channel send or receive. Alert on the derivative: if goroutine count grows by more than X per minute for Y consecutive minutes, page on-call. Pair this with process_resident_memory_bytes — when goroutine count and RSS both climb in lockstep, you have a channel leak rather than a slow cache fill.
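The "grows by more than X for Y consecutive samples" rule can be sketched as a plain function. This is an illustration under assumptions, not a Prometheus alert rule: sustainedGrowth is a hypothetical name, and the samples are assumed to be goroutine counts (for example from runtime.NumGoroutine() or the go_goroutines metric) taken at a fixed interval.

```go
package main

import "fmt"

// sustainedGrowth reports whether the count grew by more than maxPerSample
// between consecutive samples at least `consecutive` times in a row.
func sustainedGrowth(samples []int, maxPerSample, consecutive int) bool {
	run := 0
	for i := 1; i < len(samples); i++ {
		if samples[i]-samples[i-1] > maxPerSample {
			run++
			if run >= consecutive {
				return true
			}
		} else {
			run = 0 // growth paused, so start counting again
		}
	}
	return false
}

func main() {
	steady := []int{120, 118, 125, 121, 119}
	leaking := []int{120, 180, 250, 330, 400}
	fmt.Println(sustainedGrowth(steady, 50, 3))  // false
	fmt.Println(sustainedGrowth(leaking, 50, 3)) // true
}
```

Requiring several consecutive samples keeps a single traffic burst from paging anyone; the thresholds here are placeholders to tune per service.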
Recovery sequence. When goroutine count is climbing, dump every goroutine’s stack: fetch /debug/pprof/goroutine?debug=2 if the service exposes net/http/pprof, or send SIGQUIT to the process. By default SIGQUIT prints the stacks to stderr and then exits the process, so prefer the pprof endpoint if the instance must keep serving. Group the stacks by the line they’re blocked on; the offending channel operation is usually the line that appears thousands of times. Restart the instance to free the memory, then patch the leak. Without the stack dump, it is very hard to localize which channel is stuck.
Postmortem preventive. Add goleak.VerifyTestMain(m) to every test package — it catches goroutines that outlive a test, which is the most reliable way to spot the bug at PR time. For services that handle long-lived connections, run periodic runtime.NumGoroutine() checks in load tests and fail the build if the count drifts upward.
Fix 1: Match Senders and Receivers
Every channel send needs a corresponding receive, either concurrent or buffered:
// DEADLOCK: unbuffered channel, send blocks, no concurrent receiver
func bad() {
	ch := make(chan int)
	ch <- 42 // Blocks: no one is receiving
	fmt.Println(<-ch)
}
// FIX 1: use a buffered channel (send doesn't block if buffer has space)
func fix1() {
	ch := make(chan int, 1) // Buffer of 1
	ch <- 42                // Doesn't block: buffered
	fmt.Println(<-ch)       // Reads from buffer
}
// FIX 2: send from a goroutine (concurrent send + receive)
func fix2() {
	ch := make(chan int)
	go func() {
		ch <- 42 // Goroutine blocks here until main receives
	}()
	fmt.Println(<-ch) // Unblocks the goroutine
}
The fundamental rule: every send on an unbuffered channel needs a receiver that is ready in another goroutine.
Fix 2: Fix the Fan-Out / Collect Pattern
Collecting results from multiple goroutines is a common deadlock source:
// DEADLOCK: wg.Wait() blocks before reading from ch
// Goroutines are blocked on ch <- (no receiver), wg.Wait() never returns
func collectBad(data []int) []int {
	ch := make(chan int)
	var wg sync.WaitGroup
	for _, v := range data {
		wg.Add(1)
		go func(n int) {
			defer wg.Done()
			ch <- n * 2 // BLOCKS: no one reading ch yet
		}(v)
	}
	wg.Wait() // Never returns: goroutines stuck on send
	close(ch)
	var results []int
	for v := range ch {
		results = append(results, v)
	}
	return results
}
// FIX: close the channel AFTER wg.Wait() using a separate goroutine
func collectGood(data []int) []int {
	ch := make(chan int, len(data)) // Buffered: senders don't block
	var wg sync.WaitGroup
	for _, v := range data {
		wg.Add(1)
		go func(n int) {
			defer wg.Done()
			ch <- n * 2 // Buffered: doesn't block
		}(v)
	}
	// Wait for all sends, then close channel to signal collector
	go func() {
		wg.Wait()
		close(ch) // Close after all sends complete
	}()
	var results []int
	for v := range ch { // Reads until channel is closed
		results = append(results, v)
	}
	return results
}
Alternative: collect without a channel:
func collectWithMutex(data []int) []int {
	var mu sync.Mutex
	var results []int
	var wg sync.WaitGroup
	for _, v := range data {
		wg.Add(1)
		go func(n int) {
			defer wg.Done()
			result := n * 2
			mu.Lock()
			results = append(results, result)
			mu.Unlock()
		}(v)
	}
	wg.Wait()
	return results
}
Fix 3: Use select with a Default or Done Channel
select prevents blocking on a single channel operation:
// BLOCKS FOREVER if nothing is ever sent on ch; ctx is ignored
func bad(ch <-chan int, ctx context.Context) {
	<-ch // Blocks indefinitely
}
// CORRECT: use select to handle multiple cases
func good(ch <-chan int, ctx context.Context) (int, bool) {
	select {
	case value := <-ch:
		return value, true
	case <-ctx.Done():
		return 0, false // Context cancelled: stop waiting
	case <-time.After(5 * time.Second):
		return 0, false // Timeout
	}
}
// Non-blocking send/receive with default
func nonBlockingSend(ch chan<- int, value int) bool {
	select {
	case ch <- value:
		return true // Sent successfully
	default:
		return false // Channel full or no receiver: skip
	}
}
func nonBlockingReceive(ch <-chan int) (int, bool) {
	select {
	case v := <-ch:
		return v, true
	default:
		return 0, false // No data available
	}
}
Fix 4: Always Close Channels from the Sender
Channels should be closed by the sender (writer), not the receiver:
// DEADLOCK: channel never closed, range blocks forever
func producerBad() <-chan int {
	ch := make(chan int)
	go func() {
		for i := 0; i < 5; i++ {
			ch <- i
		}
		// MISSING: close(ch)
	}()
	return ch
}
func consumerBad() {
	ch := producerBad()
	for v := range ch { // Blocks after 5 items: channel never closed
		fmt.Println(v)
	}
}
// CORRECT: sender closes the channel when done
func producerGood() <-chan int {
	ch := make(chan int)
	go func() {
		defer close(ch) // Always close when done sending
		for i := 0; i < 5; i++ {
			ch <- i
		}
	}()
	return ch
}
// Reading from a closed (and drained) channel returns zero value and false
v, ok := <-ch
if !ok {
	// Channel is closed
}
Don’t close a channel from the receiver: the sender panics if it tries to send after the close:
// PANIC: sending to a closed channel panics
ch := make(chan int, 10)
close(ch)
ch <- 1 // panic: send on closed channel
Fix 5: Fix Mutex Deadlocks
sync.Mutex deadlocks when the same goroutine tries to lock it twice:
var mu sync.Mutex
// DEADLOCK: Lock() called twice in same goroutine
func bad() {
	mu.Lock()
	defer mu.Unlock()
	anotherFunc() // Calls mu.Lock(): deadlock
}
func anotherFunc() {
	mu.Lock() // Deadlock: already locked by bad()
	defer mu.Unlock()
	// ...
}
// FIX 1: don't hold lock when calling functions that also lock
func good() {
	mu.Lock()
	localCopy := sharedData // Copy data while locked
	mu.Unlock()             // Release before calling other functions
	_ = localCopy           // Work with the copy here, outside the lock
	anotherFunc()           // No lock held: anotherFunc can acquire it
}
// FIX 2: restructure so mutex is only locked at one level
func goodAlternative() {
	data := getDataWithoutLock() // No lock
	mu.Lock()
	defer mu.Unlock()
	sharedData = processData(data) // Only hold lock for the write
}
Detect lock order issues (AB-BA deadlock):
// POTENTIAL DEADLOCK — goroutine 1 locks A then B
// goroutine 2 locks B then A
var mutexA, mutexB sync.Mutex
// Goroutine 1
mutexA.Lock()
mutexB.Lock() // Waits for goroutine 2 to release B
mutexB.Unlock()
mutexA.Unlock()
// Goroutine 2 (concurrent)
mutexB.Lock()
mutexA.Lock() // Waits for goroutine 1 to release A → DEADLOCK
mutexA.Unlock()
mutexB.Unlock()
// FIX — always acquire locks in the same order
// Both goroutines: lock A first, then B
Fix 6: Detect Deadlocks with the Race Detector
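Run your program or tests with the race detector. It does not detect deadlocks itself, but it reports data races, and unsynchronized shared state (for example a flag or counter that decides whether a send or close happens) is a common reason a channel operation never gets its counterpart: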
# Run with race detector
go run -race main.go
go test -race ./...
# Build with race detector (for staging/canary deployments)
go build -race -o myapp
For channel-specific deadlock debugging, add timeouts:
// Instead of blocking forever, add a timeout to identify the stuck operation
func withTimeout(fn func() error) error {
	done := make(chan error, 1)
	go func() {
		done <- fn()
	}()
	select {
	case err := <-done:
		return err
	case <-time.After(10 * time.Second):
		// Dump goroutine stack traces to identify the deadlock
		buf := make([]byte, 1<<20)
		n := runtime.Stack(buf, true)
		fmt.Printf("TIMEOUT: goroutine stacks:\n%s\n", buf[:n])
		return errors.New("operation timed out")
	}
}
Print goroutine stacks on SIGQUIT:
# Send SIGQUIT to a running Go program to dump all goroutine stacks
# (by default the program exits after printing them)
kill -SIGQUIT <pid>
# Or in tests
go test -v -timeout 30s ./...
# -timeout causes go test to panic with stack dump after 30s
Fix 7: Channel Direction in Function Signatures
Using typed channel directions prevents accidental misuse:
// Bidirectional: any goroutine can send or receive
ch := make(chan int)
// Send-only: function can only send, not receive
func producer(ch chan<- int) {
	ch <- 42
	// <-ch // Compile error: can't receive on send-only channel
}
// Receive-only: function can only receive, not send
func consumer(ch <-chan int) {
	v := <-ch
	fmt.Println(v) // Use v, otherwise the compiler rejects the unused variable
	// ch <- 42 // Compile error: can't send on receive-only channel
}
// Pattern: pipeline
func generateNumbers(count int) <-chan int {
	ch := make(chan int)
	go func() {
		defer close(ch)
		for i := 0; i < count; i++ {
			ch <- i
		}
	}()
	return ch // Returns receive-only channel: caller can't accidentally send
}
func doubleValues(in <-chan int) <-chan int {
	out := make(chan int)
	go func() {
		defer close(out)
		for v := range in {
			out <- v * 2
		}
	}()
	return out
}
// Usage
numbers := generateNumbers(10)
doubled := doubleValues(numbers)
for v := range doubled {
	fmt.Println(v)
}
Still Not Working?
Deadlock outside main — Go’s deadlock detector only fires if ALL goroutines are blocked. If one goroutine is still running (e.g., a background timer), Go won’t detect the deadlock. Use goleak in tests to detect goroutine leaks:
go get go.uber.org/goleak
func TestMain(m *testing.M) {
	goleak.VerifyTestMain(m) // Fails if any goroutines leak after tests
}
select with nil channels: a receive or send on a nil channel blocks forever. In select, a nil channel case is simply never selected (useful for disabling a case conditionally):
var ch chan int // nil channel
select {
case v := <-ch: // This case is never selected: nil channel blocks forever
	fmt.Println(v)
case <-time.After(1 * time.Second):
	fmt.Println("timeout")
}
// Prints "timeout": nil channel in select is effectively disabled
Deadlock in test code: Go tests run with a timeout (default 10 minutes). If a test deadlocks, it eventually times out with a goroutine dump. Use go test -timeout 10s to get faster feedback during debugging.
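Channel send or close that never runs after a panic: if a goroutine panics before it sends its result or closes the channel, and the panic is recovered further up without that cleanup running, the receiver waits forever. Deferred calls do run while a panic unwinds the stack, but only those already registered, so a defer placed after the risky work never takes effect. Register defer close(ch) (or a deferred function that sends the result) at the top of the goroutine, and call recover() there if the goroutine should report the failure instead of crashing the program.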
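One way to guarantee the receiver always gets an answer is sketched below. runGuarded and the result type are hypothetical names for illustration. The deferred function is registered before the risky call, and the channel has capacity 1 so the single send can never block.

```go
package main

import "fmt"

// result carries either a value or the error from a recovered panic.
type result struct {
	value int
	err   error
}

// runGuarded runs work in a goroutine and always delivers exactly one
// result, even if work panics, so the receive below cannot block forever.
func runGuarded(work func() int) (int, error) {
	out := make(chan result, 1) // capacity 1: the send never blocks
	go func() {
		// Registered before the risky call, so it also runs on panic.
		defer func() {
			if r := recover(); r != nil {
				out <- result{err: fmt.Errorf("worker panicked: %v", r)}
			}
		}()
		out <- result{value: work()}
	}()
	res := <-out
	return res.value, res.err
}

func main() {
	fmt.Println(runGuarded(func() int { return 42 }))     // 42 <nil>
	fmt.Println(runGuarded(func() int { panic("boom") })) // 0 worker panicked: boom
}
```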
Goroutine blocked on a channel that escaped scope — if a goroutine captures a channel variable that’s no longer referenced anywhere else, the channel can’t be closed and the goroutine blocks indefinitely. Pass channels explicitly into goroutines and ensure at least one caller still holds the sender side.
select with a time.After leak: before Go 1.23, each time.After(d) call allocated a timer that could not be garbage collected until d elapsed, even if the select had already returned via another case. In tight loops these timers accumulate and show up in pprof as memory growth (they are timers, not goroutines). Use time.NewTimer with an explicit timer.Stop() instead; it remains good practice on newer Go versions.
For related Go issues, see Fix: Go Goroutine Leak, Fix: Go Goroutine Deadlock, Fix: Go Context Deadline Exceeded, and Fix: Go Panic/Recover Patterns.
Solo developer based in Japan. Every solution is cross-referenced with official documentation and tested before publishing.