Fix: Kubernetes Pod OOMKilled — Out of Memory Error

FixDevs

Quick Answer

How to fix Kubernetes OOMKilled errors — understanding memory limits, finding memory leaks, setting correct resource requests and limits, and using Vertical Pod Autoscaler.

The Error

A Kubernetes pod terminates with OOMKilled (exit code 137):

$ kubectl get pods
NAME                    READY   STATUS      RESTARTS   AGE
api-deployment-7d9fb   0/1     OOMKilled   3          12m

Or in kubectl describe pod:

State:          Terminated
  Reason:       OOMKilled
  Exit Code:    137
  Started:      Thu, 20 Mar 2026 10:00:00 +0000
  Finished:     Thu, 20 Mar 2026 10:02:34 +0000
Last State:     Terminated
  Reason:       OOMKilled
  Exit Code:    137

Or the pod keeps restarting with increasing RESTARTS:

NAME               READY   STATUS             RESTARTS   AGE
worker-pod-xk2mp   0/1     CrashLoopBackOff   8          40m

Why This Happens

Kubernetes enforces memory limits at the container level using Linux cgroups. When a container’s memory usage exceeds its configured limits.memory, the kernel immediately terminates the process with SIGKILL (exit code 137).

The OOM killer is not a Kubernetes feature; it is a Linux kernel feature that Kubernetes relies on. Kubernetes sets up a cgroup for each container with a hard memory ceiling. When the memory charged to that cgroup (memory used by every process in the container, including child processes, plus shared memory segments and tmpfs usage) exceeds the limit and cannot be reclaimed, the kernel’s OOM killer picks the process in the cgroup with the highest oom_score (a badness score derived mainly from memory usage and adjusted by oom_score_adj) and kills it. In a container there is typically only one main process, so that is usually the one killed. The process receives SIGKILL (not SIGTERM), meaning it gets no opportunity to clean up, flush buffers, or log a final message. The exit code 137 is 128 + 9 (SIGKILL’s signal number).
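
The 128 + 9 arithmetic applies to any container that was terminated by a signal. As a minimal sketch (the helper name describe_exit_code is hypothetical, and the signal names assume a Linux or other POSIX host), the decoding can be done with Python's standard signal module:

```python
import signal


def describe_exit_code(code):
    """Translate a container exit code into a short human-readable cause."""
    if code > 128:
        # Convention: an exit code of 128 + N means "terminated by signal N".
        signum = code - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Signal number not known on this platform.
            name = "signal {}".format(signum)
        return "killed by {}".format(name)
    return "exited with status {}".format(code)


print(describe_exit_code(137))  # 137 - 128 = 9  (SIGKILL)
print(describe_exit_code(143))  # 143 - 128 = 15 (SIGTERM)
```

Exit code 137 on its own only says SIGKILL was delivered; the Reason: OOMKilled field is what confirms the kernel's OOM killer sent it.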

There are two distinct scenarios that lead to OOMKilled. The first is a correctly functioning application that simply needs more memory than the limit allows. This happens after traffic spikes, when processing larger payloads than anticipated, or when loading datasets that grew since the limit was last tuned. The fix is straightforward: increase the limit. The second scenario is a memory leak — the application allocates memory incrementally and never releases it. Usage grows linearly over minutes or hours until it hits the ceiling. The pod restarts, memory starts low, then climbs again. The restart pattern is a strong signal of a leak: if OOMKilled happens at the same time interval after each restart (e.g., consistently 2 hours after start), the growth rate is constant and a leak is almost certain.
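
One way to test the constant-growth hypothesis is to fit a straight line to a few memory samples and project when it crosses the limit. The sketch below assumes you have collected (seconds since start, bytes used) pairs yourself, for example by recording kubectl top output; the function name and the sample values are illustrative, not part of any Kubernetes tooling.

```python
def estimate_seconds_to_oom(samples, limit_bytes):
    """Project when linearly growing memory usage will reach the limit.

    samples: list of (seconds_since_start, bytes_used) pairs, at least two,
             taken at different times.
    Returns the estimated seconds remaining after the last sample, or None
    if usage is flat or shrinking (no leak-like trend).
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")

    mean_t = sum(t for t, _ in samples) / n
    mean_m = sum(m for _, m in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        raise ValueError("samples must span more than one point in time")

    # Ordinary least-squares slope (bytes per second) and intercept.
    slope = sum((t - mean_t) * (m - mean_m) for t, m in samples) / var_t
    if slope <= 0:
        return None
    intercept = mean_m - slope * mean_t

    time_at_limit = (limit_bytes - intercept) / slope
    return max(0.0, time_at_limit - samples[-1][0])


# Illustrative numbers: usage climbs 60 bytes per minute toward a 400-byte limit.
readings = [(0, 100), (60, 160), (120, 220)]
print(estimate_seconds_to_oom(readings, 400))
```

If the projected time matches the interval between restarts, a leak is the more likely explanation; if usage is flat until a sudden jump, look at payload sizes or traffic spikes instead.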

A third category is JVM and runtime misconfiguration. Java applications with -Xmx set higher than the container’s memory limit, or older JVMs (before Java 10 and 8u191) that size the heap from the host’s total RAM instead of the container’s cgroup limit, allow the heap to grow beyond the container ceiling. The JVM reserves address space up front, but the cgroup only charges memory that is actually touched, so the container is killed as soon as real usage crosses the limit. That can happen during startup, before the application even serves a request. Node.js, Python, and Go each have their own memory awareness caveats within containers.

How Other Tools Handle This

Understanding how different container runtimes and orchestrators enforce memory limits helps diagnose OOMKilled across different environments and explains why behavior varies between them.

Docker’s --memory flag sets the same cgroup limit that Kubernetes uses, because both delegate to the Linux kernel. Running docker run --memory=512m is functionally equivalent to setting limits.memory: 512Mi in a Kubernetes pod spec. The difference is in what happens next. Docker does not restart the container by default when it is OOM-killed: it stops with exit code 137 and stays stopped. Kubernetes’ restartPolicy: Always (the default, and the only value a Deployment accepts) automatically restarts it, which creates the CrashLoopBackOff cycle. Docker also supports --memory-reservation (a soft limit, loosely comparable to Kubernetes requests.memory, although requests mainly drive scheduling) and --oom-kill-disable (which prevents the OOM killer entirely but risks freezing the host). Kubernetes does not expose an equivalent to --oom-kill-disable.

HashiCorp Nomad uses a similar model with memory (hard limit) and memory_max (optional oversubscription ceiling). When a task exceeds memory, Nomad kills it. When memory_max is set, the task can burst up to that value if the node has spare memory — a feature Kubernetes achieves through the gap between requests and limits. Nomad’s memory_max is more explicit about the intent and avoids the confusion of the Kubernetes requests/limits split.

Amazon ECS distinguishes between memoryReservation (soft limit) and memory (hard limit). The soft limit corresponds to Kubernetes requests; the hard limit corresponds to limits. If only memoryReservation is set, there is no hard ceiling — the container can use all available host memory. This makes ECS more permissive by default than Kubernetes, where limits are commonly set.

cgroups v1 vs cgroups v2 affects OOMKilled behavior at the kernel level. cgroups v1 (used in older Kubernetes clusters, typically with kernel < 5.8) tracks memory per container but has known accounting gaps: kernel memory (slab, page tables) may not be fully reflected in the reported usage, so the figure you see can understate what the kernel is charging to the container. cgroups v2 (default in newer clusters) unifies memory accounting and includes kernel memory in the container’s usage, leading to more predictable OOMKilled behavior. If you see OOMKilled at memory usage well below the limit (according to kubectl top), one possible cause is a cgroups v1 cluster where kernel memory is the hidden consumer; another is a short spike between metric samples, since kubectl top shows periodically scraped values. Check the cgroup version with stat -fc %T /sys/fs/cgroup/ inside the container: cgroup2fs means v2, tmpfs means v1.

Runtime-specific container awareness:

| Runtime | Container-aware? | Flag to set | Notes |
| --- | --- | --- | --- |
| JVM 10+ | Yes (auto) | -XX:MaxRAMPercentage=75.0 | Reads cgroup limit automatically |
| JVM 8u191+ | Yes (auto) | -XX:+UseContainerSupport (on by default) | Backported from JVM 10 |
| JVM < 8u191 | No | -Xmx must be hardcoded | Reads host RAM, not container |
| Node.js 12+ | Partial | --max-old-space-size=400 | Reads cgroup limit for defaults since v12.17, but explicit flag is safer |
| Python/CPython | No | --max-requests (gunicorn) | No native cgroup awareness; control via worker recycling |
| Go | Manual (1.19+) | GOMEMLIMIT=400MiB | The runtime does not read the cgroup limit itself; GOMEMLIMIT sets a soft GC target, not a hard cap |

Fix 1: Diagnose Which Container Is Being Killed

First, identify the container and confirm it’s a memory issue:

# Check pod status and restart count
kubectl get pods -n <namespace>

# Describe the pod for detailed OOMKill information
kubectl describe pod <pod-name> -n <namespace>

# Check previous container logs (before the crash)
kubectl logs <pod-name> -n <namespace> --previous

# Check events for the pod
kubectl get events -n <namespace> --field-selector involvedObject.name=<pod-name> --sort-by='.lastTimestamp'

In kubectl describe pod, look for:

Containers:
  api:
    State:          Running
    Last State:     Terminated
      Reason:       OOMKilled     # ← Confirmed OOM
      Exit Code:    137
    Limits:
      memory:       256Mi         # ← Current limit
    Requests:
      memory:       128Mi
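
The same check can be scripted against the pod's JSON instead of reading the describe output by eye. The sketch below assumes the pod object has already been loaded into a dict (for example with json.load on the output of kubectl get pod <pod-name> -o json); the function name find_oomkilled and the embedded sample pod are illustrative. It inspects both init containers and regular containers.

```python
def find_oomkilled(pod):
    """Return one entry per container whose current or last state is OOMKilled."""
    hits = []
    status = pod.get("status") or {}
    for kind in ("initContainerStatuses", "containerStatuses"):
        for cs in status.get(kind) or []:
            for state_key in ("state", "lastState"):
                terminated = (cs.get(state_key) or {}).get("terminated")
                if terminated and terminated.get("reason") == "OOMKilled":
                    hits.append({
                        "container": cs.get("name"),
                        "kind": kind,
                        "exit_code": terminated.get("exitCode"),
                        "restarts": cs.get("restartCount", 0),
                    })
                    break  # report each container only once
    return hits


# Hand-written sample in the shape of `kubectl get pod -o json` output.
sample_pod = {
    "status": {
        "containerStatuses": [
            {
                "name": "api",
                "restartCount": 3,
                "state": {"running": {}},
                "lastState": {"terminated": {"reason": "OOMKilled", "exitCode": 137}},
            },
            {"name": "proxy", "restartCount": 0, "state": {"running": {}}, "lastState": {}},
        ]
    }
}

for hit in find_oomkilled(sample_pod):
    print(hit)
```

In a multi-container pod this makes it explicit which container hit its limit, which is easy to misread when the pod-level status only shows CrashLoopBackOff.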

Check node-level memory pressure:

# Check if the node is under memory pressure
kubectl describe node <node-name> | grep -A 5 "Conditions:"
# MemoryPressure: True means the node itself is running low

# Check actual memory usage on the node
kubectl top node <node-name>

# Check pod memory usage
kubectl top pods -n <namespace>
kubectl top pods -n <namespace> --containers  # Per-container breakdown

Fix 2: Increase Memory Limits

If the application legitimately needs more memory than currently allocated, increase the limit:

# deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api
spec:
  selector:
    matchLabels:
      app: api              # Assumed label; use the labels of your existing Deployment
  template:
    metadata:
      labels:
        app: api            # Must match spec.selector.matchLabels
    spec:
      containers:
      - name: api
        image: myapp:latest
        resources:
          requests:
            memory: "256Mi"   # Minimum guaranteed memory
            cpu: "250m"
          limits:
            memory: "512Mi"   # Maximum; OOMKill fires if exceeded
            cpu: "500m"

kubectl apply -f deployment.yaml

Or patch the deployment directly:

kubectl patch deployment api -n <namespace> --patch '
{
  "spec": {
    "template": {
      "spec": {
        "containers": [{
          "name": "api",
          "resources": {
            "limits": {"memory": "512Mi"},
            "requests": {"memory": "256Mi"}
          }
        }]
      }
    }
  }
}'

Memory unit reference:

| Value | Meaning |
| --- | --- |
| 128Mi | 128 mebibytes (134 MB); use Mi for binary |
| 512Mi | 512 mebibytes (537 MB) |
| 1Gi | 1 gibibyte (1.07 GB) |
| 256M | 256 megabytes (decimal); avoid, use Mi |
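
The gap between binary and decimal suffixes is easy to underestimate, so here is a small sketch that converts a quantity string to bytes. It handles only a subset of the Kubernetes quantity syntax (plain integers and the Ki/Mi/Gi/Ti and k/M/G/T suffixes); the function name is hypothetical and this is not the parser Kubernetes itself uses.

```python
BINARY_SUFFIXES = {"Ki": 1024, "Mi": 1024 ** 2, "Gi": 1024 ** 3, "Ti": 1024 ** 4}
DECIMAL_SUFFIXES = {"k": 1000, "M": 1000 ** 2, "G": 1000 ** 3, "T": 1000 ** 4}


def memory_quantity_to_bytes(quantity):
    """Convert a memory quantity such as '128Mi' or '256M' to a byte count."""
    # Binary suffixes are checked first: they are two characters long and
    # would otherwise never be confused with the one-character decimal ones.
    for suffix, factor in BINARY_SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(float(quantity[:-len(suffix)]) * factor)
    for suffix, factor in DECIMAL_SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(float(quantity[:-len(suffix)]) * factor)
    return int(quantity)  # bare number of bytes


for q in ("128Mi", "512Mi", "1Gi", "256M"):
    print(q, memory_quantity_to_bytes(q))

# How much smaller is 256M than 256Mi?
print(memory_quantity_to_bytes("256Mi") - memory_quantity_to_bytes("256M"))
```

The last line shows that writing 256M instead of 256Mi gives the container roughly 12 MB less than intended, which is enough to matter for a workload running close to its limit.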

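Common Mistake: Setting limits.memory equal to requests.memory when the request is sized for steady-state usage. The request is what the scheduler reserves; the limit is the maximum. A limit with no headroom above normal usage causes OOMKill on any spike. A reasonable ratio is 2:1 (limit = 2x request) for stable apps, higher for bursty workloads. If you do want equal values (for example, to get the Guaranteed QoS class, which also requires equal CPU requests and limits), size both for peak usage rather than the average.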

Fix 3: Fix Java/JVM Memory Configuration

Java applications are a frequent cause of OOMKilled because the JVM doesn’t respect container memory limits by default in older versions:

# Wrong — JVM reads host total RAM, not container limit
# On a 16GB node with a 512Mi container limit, JVM sets heap to ~4GB
java -jar app.jar

# OOMKill fires when JVM tries to use the 4GB heap inside a 512Mi container

Fix for Java 11+ — use container-aware JVM flags:

# Dockerfile
FROM eclipse-temurin:21-jre

# UseContainerSupport is on by default in Java 10+
# MaxRAMPercentage controls heap as a fraction of container memory
ENTRYPOINT ["java", \
  "-XX:+UseContainerSupport", \
  "-XX:MaxRAMPercentage=75.0", \
  "-jar", "/app.jar"]

Or set explicit heap limits that fit within the container:

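# If container limit is 512Mi, set heap to at most 400Mi
# (leaving room for JVM overhead, metaspace, etc.)
containers:
- name: api
  env:
  # The JVM reads JAVA_TOOL_OPTIONS automatically. JAVA_OPTS only takes effect
  # if the image's start script passes it to the java command.
  - name: JAVA_TOOL_OPTIONS
    value: "-Xms128m -Xmx400m -XX:+UseContainerSupport"
  resources:
    limits:
      memory: "512Mi"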
For Node.js — set --max-old-space-size:

containers:
- name: node-api
  command: ["node", "--max-old-space-size=400", "dist/index.js"]
  resources:
    limits:
      memory: "512Mi"

For Python — configure gunicorn worker memory:

containers:
- name: python-api
  command: ["gunicorn", "--workers=2", "--worker-class=uvicorn.workers.UvicornWorker",
            "--max-requests=1000", "--max-requests-jitter=50",
            "app:app"]

--max-requests restarts workers after N requests, preventing slow memory leaks from accumulating.

Fix 4: Find and Fix Memory Leaks

If memory grows continuously until OOMKill, it’s likely a leak. Profile memory usage before it crashes:

Enable memory profiling for Node.js:

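// Add to your Node.js app (assumes `app` is an existing Express application)
const v8 = require('v8');

// Trigger heap snapshot via HTTP endpoint.
// writeHeapSnapshot() is synchronous and blocks the event loop. It also needs
// extra memory while it runs, so call it well before the container nears its limit.
app.get('/debug/heap-snapshot', (req, res) => {
  const filename = `/tmp/heapdump-${Date.now()}.heapsnapshot`;
  const writtenPath = v8.writeHeapSnapshot(filename);  // Returns the file path, not a stream
  res.json({ snapshot: writtenPath });
});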

// Monitor heap size
setInterval(() => {
  const { heapUsed, heapTotal } = process.memoryUsage();
  console.log(`Heap: ${Math.round(heapUsed / 1024 / 1024)}MB / ${Math.round(heapTotal / 1024 / 1024)}MB`);
}, 30000);

For Python — use tracemalloc:

import tracemalloc

def display_top(snapshot, key_type='lineno', limit=10):
    snapshot = snapshot.filter_traces((
        tracemalloc.Filter(False, "<frozen importlib._bootstrap>"),
        tracemalloc.Filter(False, "<unknown>"),
    ))
    top_stats = snapshot.statistics(key_type)
    for index, stat in enumerate(top_stats[:limit], 1):
        frame = stat.traceback[0]
        print(f"#{index}: {frame.filename}:{frame.lineno}: {stat.size / 1024:.1f} KiB")

tracemalloc.start()
# ... your code ...
snapshot = tracemalloc.take_snapshot()
display_top(snapshot)
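
A single snapshot shows what is large; a leak shows up more clearly as what is growing. As a sketch building on the same tracemalloc module, two snapshots can be diffed with Snapshot.compare_to. The helper name top_growth and the deliberately leaky list are illustrative stand-ins for your own code.

```python
import tracemalloc


def top_growth(before, after, limit=5):
    """Return the source lines whose allocated size grew most between snapshots."""
    # compare_to() returns StatisticDiff entries, largest absolute change first.
    diffs = after.compare_to(before, "lineno")
    return [d for d in diffs if d.size_diff > 0][:limit]


tracemalloc.start()
before = tracemalloc.take_snapshot()

# Stand-in for leaking application code: roughly 1 MiB that is never released.
retained = [bytes(1024) for _ in range(1000)]

after = tracemalloc.take_snapshot()
tracemalloc.stop()

for diff in top_growth(before, after):
    frame = diff.traceback[0]
    print("{}:{}: +{:.1f} KiB".format(frame.filename, frame.lineno, diff.size_diff / 1024))
```

In a long-running service, take the snapshots a few minutes apart under normal traffic; lines that keep appearing at the top of the diff are the first places to look.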

Use kubectl exec to check memory inside a running container:

# Get shell in the container
kubectl exec -it <pod-name> -n <namespace> -- /bin/bash

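# Check memory (note: /proc/meminfo typically reports the node's memory,
# not the container's limit; the cgroup files below are the ones that matter)
cat /proc/meminfo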
cat /sys/fs/cgroup/memory/memory.usage_in_bytes     # Current usage (cgroups v1)
cat /sys/fs/cgroup/memory/memory.limit_in_bytes     # Container limit (cgroups v1)
cat /sys/fs/cgroup/memory.current                   # Current usage (cgroups v2)
cat /sys/fs/cgroup/memory.max                       # Container limit (cgroups v2)
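
If the application itself should log how close it is to the limit, the same files can be read from code. The sketch below is an illustration only: the function name is hypothetical, and it assumes the standard mount point /sys/fs/cgroup and the file names listed above. It tries the cgroups v2 files first and falls back to v1.

```python
from pathlib import Path


def read_cgroup_memory(root="/sys/fs/cgroup"):
    """Return (version, usage_bytes, limit_bytes) for the current cgroup.

    limit_bytes is None when cgroups v2 reports no limit ("max").
    """
    root = Path(root)

    v2_usage = root / "memory.current"
    if v2_usage.exists():
        usage = int(v2_usage.read_text())
        raw_limit = (root / "memory.max").read_text().strip()
        limit = None if raw_limit == "max" else int(raw_limit)
        return "v2", usage, limit

    # cgroups v1 keeps the memory controller in its own subdirectory.
    v1_dir = root / "memory"
    usage = int((v1_dir / "memory.usage_in_bytes").read_text())
    limit = int((v1_dir / "memory.limit_in_bytes").read_text())
    return "v1", usage, limit
```

Calling read_cgroup_memory() inside the container and logging usage / limit every few seconds gives a record of the climb even though the process gets no chance to log anything at the moment it is killed. On cgroups v1, an unlimited container reports a very large number for the limit rather than a sentinel value.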

Common leak patterns:

// Node.js — event listener leak
// WRONG — listener added on every request, never removed
app.get('/data', (req, res) => {
  emitter.on('data', handleData);  // Leak: listener accumulates
  emitter.emit('data', someData);
  res.send('ok');
});

// CORRECT — use once() or remove the listener
app.get('/data', (req, res) => {
  emitter.once('data', handleData);  // Fires once, auto-removed
  emitter.emit('data', someData);
  res.send('ok');
});

# Python: cache without eviction
# WRONG — unbounded cache grows forever
cache = {}
def get_data(key):
    if key not in cache:
        cache[key] = expensive_fetch(key)
    return cache[key]

# CORRECT — use LRU cache with size limit
from functools import lru_cache

@lru_cache(maxsize=1000)
def get_data(key):
    return expensive_fetch(key)

Fix 5: Set Memory Requests and Limits Correctly

Kubernetes scheduling depends on requests, but OOMKill depends on limits. Both must be set correctly:

resources:
  requests:
    memory: "128Mi"   # Scheduler uses this to find a node with enough memory
  limits:
    memory: "256Mi"   # Kernel kills the container if it exceeds this

Rules for setting values:

  1. Requests = steady-state memory usage at normal load (measure with kubectl top pods)
  2. Limits = peak memory usage under high load, plus a safety buffer (20-50%)
  3. Never set limits lower than requests (Kubernetes rejects this)
  4. Avoid setting limits equal to requests — any spike causes OOMKill
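
Rules 1 and 2 can be turned into a small calculation once you have a series of usage readings. The sketch below makes two assumptions that are not from Kubernetes itself: it treats the median reading as "steady state", and it applies the safety buffer as a percentage on top of the observed peak. The function name and sample readings are illustrative.

```python
import math
import statistics


def recommend_memory(samples_mib, buffer_percent=30):
    """Suggest (request_mib, limit_mib) from observed usage samples in MiB."""
    if not samples_mib:
        raise ValueError("need at least one usage sample")

    # Rule 1: request approximates steady-state usage (median, as an assumption).
    request = math.ceil(statistics.median(samples_mib))

    # Rule 2: limit is the observed peak plus a safety buffer.
    peak = max(samples_mib)
    limit = math.ceil(peak * (100 + buffer_percent) / 100)

    # Rule 3: the limit must never be lower than the request.
    return request, max(limit, request)


# Illustrative readings taken with `kubectl top pods` over a day, in MiB.
observed = [100, 110, 105, 180, 102]
print(recommend_memory(observed))
```

The result is only as good as the sampling window: if the readings miss the weekly batch job or the traffic peak, the suggested limit will be too low.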

Use kubectl top to find actual usage:

# Watch memory usage over time
watch kubectl top pods -n <namespace>

# For a specific pod
kubectl top pod <pod-name> -n <namespace> --containers

LimitRange — set default limits for a namespace:

# limitrange.yaml — applies when containers don't specify resources
apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits
  namespace: production
spec:
  limits:
  - default:
      memory: "256Mi"
      cpu: "500m"
    defaultRequest:
      memory: "128Mi"
      cpu: "250m"
    type: Container

Fix 6: Use Vertical Pod Autoscaler (VPA)

If you’re unsure of the right memory values, VPA can recommend or automatically set them based on observed usage:

# vpa.yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: api-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api
  updatePolicy:
    updateMode: "Off"    # "Off" = recommend only, don't auto-update
    # updateMode: "Auto" = automatically update pod resources
  resourcePolicy:
    containerPolicies:
    - containerName: api
      minAllowed:
        memory: "64Mi"
      maxAllowed:
        memory: "2Gi"

# Install VPA (if not already installed) with the script from the autoscaler repository
git clone https://github.com/kubernetes/autoscaler.git
cd autoscaler/vertical-pod-autoscaler
./hack/vpa-up.sh
cd -

# Apply the VPA
kubectl apply -f vpa.yaml

# Check VPA recommendations
kubectl describe vpa api-vpa -n <namespace>
# Look for: Recommendation > Container Recommendations > Target

Pro Tip: Run VPA in "Off" mode first for a week to collect usage data and get recommendations. Only switch to "Auto" mode once you’ve validated the recommendations match your expectations. Auto mode evicts and recreates pods to apply new resource values, which can cause brief downtime.

Still Not Working?

Check if it’s the init container being OOMKilled — init containers run before the main container and can also be killed:

kubectl describe pod <pod-name> | grep -A 10 "Init Containers:"

Check node-level OOM events, not just pod events:

# From your workstation: search cluster events
kubectl get events --all-namespaces | grep -i oom

# On the node itself (via SSH): search the kernel log
journalctl -k | grep -i "oom\|killed process"

For persistent memory growth despite restarts, the issue may be an external resource (Redis, database connection pool) that isn’t cleaned up on restart. Check for connection leaks in your application.

Set terminationMessagePolicy: FallbackToLogsOnError to capture logs from OOMKilled containers:

containers:
- name: api
  terminationMessagePolicy: FallbackToLogsOnError

kubectl describe pod <pod-name> | grep -A 5 "Last State:"
# Termination Message section may contain the last log lines before OOMKill

Check for tmpfs volumes consuming memory. Kubernetes emptyDir volumes with medium: Memory are backed by tmpfs, and their usage counts against the container’s memory limit. If your application writes temporary files to an in-memory emptyDir, those bytes are added to the cgroup’s memory usage. A large log file or temp file written to /dev/shm or a memory-backed emptyDir can push the container over its limit without the application heap growing at all. Verify with df -h inside the container and check for tmpfs mounts.
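
To find candidate mounts without scanning df output by hand, the mount table can be parsed directly. This is a rough sketch with hypothetical function names: it assumes the usual /proc/mounts layout (device, mount point, filesystem type, options) and does not handle mount points containing escaped spaces.

```python
import shutil


def tmpfs_mount_points(mounts_text):
    """Extract the mount points of all tmpfs filesystems from /proc/mounts text."""
    points = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # Field layout: device, mount point, filesystem type, options, ...
        if len(fields) >= 3 and fields[2] == "tmpfs":
            points.append(fields[1])
    return points


def tmpfs_usage(mounts_path="/proc/mounts"):
    """Return {mount_point: bytes_used} for every readable tmpfs mount."""
    with open(mounts_path) as f:
        points = tmpfs_mount_points(f.read())
    usage = {}
    for point in points:
        try:
            usage[point] = shutil.disk_usage(point).used
        except OSError:
            continue  # mount not accessible from this process
    return usage
```

Running tmpfs_usage() inside the container and sorting by size shows quickly whether /dev/shm or a memory-backed emptyDir accounts for a meaningful share of the limit.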

Check for sidecar containers consuming unexpected memory. Istio/Envoy sidecars, Datadog agents, Fluentd log collectors, and other sidecars run in the same pod, but each container has its own memory limit and its own cgroup. A sidecar that uses 150Mi does not reduce the 512Mi available to the main container; instead, the sidecar itself may be the container that is OOMKilled, and it adds to the pod’s total footprint on the node. Use kubectl top pods --containers to see the per-container memory breakdown, confirm in kubectl describe pod which container was killed, and adjust limits for each container individually.

Verify the ResourceQuota isn’t capping pod memory. If the namespace has a ResourceQuota with a limits.memory ceiling, the total memory limits across all pods in the namespace cannot exceed that value. When the quota is exhausted, new pods (including those created by a rollout that raises limits, or by scaling up) are rejected at creation. The failure shows up as FailedCreate events on the ReplicaSet rather than on a pod, so it is easy to miss. A quota does not evict pods that are already running. Check with kubectl describe resourcequota -n <namespace>.

For related Kubernetes issues, see Fix: Kubernetes CrashLoopBackOff, Fix: Kubernetes Pod Pending, Fix: Docker Exit 137 OOMKilled, and Fix: Java OutOfMemoryError.

FixDevs

Solo developer based in Japan. Every solution is cross-referenced with official documentation and tested before publishing.
