
Fix: Kubernetes Pod stuck in Pending state

FixDevs

Quick Answer

A Pod stays Pending when the scheduler cannot place it on any node. Run kubectl describe pod, read the FailedScheduling events, and apply the matching fix below: insufficient CPU or memory, node selectors and affinity, taints, unbound PersistentVolumeClaims, resource quotas, LimitRanges, or topology spread constraints.

The Error

You deploy a Pod and it stays in Pending status indefinitely:

$ kubectl get pods
NAME                     READY   STATUS    RESTARTS   AGE
my-app-7b4f5c8d9-xk2lm  0/1     Pending   0          10m

Running kubectl describe pod shows events like:

Events:
  Warning  FailedScheduling  default-scheduler  0/3 nodes are available:
  1 node(s) had untolerated taint {node.kubernetes.io/not-ready: },
  2 node(s) didn't match Pod's node affinity/selector,
  3 Insufficient cpu.

Or:

Warning  FailedScheduling  0/5 nodes are available: 5 Insufficient memory.
Warning  FailedScheduling  pod has unbound immediate PersistentVolumeClaims.

The Kubernetes scheduler cannot find a node to place the Pod on. The Pod sits in Pending until the scheduling constraint is resolved.

Why This Happens

A Pod enters Pending when the scheduler cannot find a suitable node. The scheduler checks:

  1. Resource requests. Does any node have enough CPU and memory?
  2. Node selectors and affinity. Does the Pod require specific node labels?
  3. Taints and tolerations. Is the node tainted, and does the Pod tolerate it?
  4. PersistentVolumeClaims. Are the requested volumes available?
  5. Resource quotas. Has the namespace exceeded its quota?
  6. Pod topology constraints. Are there spread constraints that cannot be satisfied?

The describe pod events tell you exactly why scheduling failed. Always check there first.
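Before working through individual fixes, it helps to see every stuck Pod at once. With a live cluster, kubectl can filter by phase directly; the awk fallback below gives the same result from plain listing output, demonstrated here on a sample so you can see the shape:

```shell
# With a live cluster, filter directly:
#   kubectl get pods --all-namespaces --field-selector status.phase=Pending
#
# The same filter with awk over `kubectl get pods -A` output,
# demonstrated on sample text (column 4 is STATUS):
cat <<'EOF' | awk 'NR==1 || $4=="Pending"'
NAMESPACE  NAME                     READY  STATUS   RESTARTS  AGE
default    my-app-7b4f5c8d9-xk2lm   0/1    Pending  0         10m
default    other-app-5d6f7c8b9-abc  1/1    Running  0         2d
EOF
# prints the header line plus only the Pending row
```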

Fix 1: Fix Insufficient CPU or Memory

The most common cause. No node has enough free resources:

# Check the events
kubectl describe pod my-app-7b4f5c8d9-xk2lm | grep -A 10 Events

# Check node resource usage
kubectl top nodes
kubectl describe nodes | grep -A 5 "Allocated resources"

Reduce resource requests:

# Before — requesting too much
resources:
  requests:
    cpu: "4"
    memory: "8Gi"

# After — right-sized
resources:
  requests:
    cpu: "500m"       # 0.5 CPU cores
    memory: "512Mi"
  limits:
    cpu: "1"
    memory: "1Gi"

Check what is consuming resources on the nodes:

# List all pods sorted by CPU request
kubectl get pods --all-namespaces -o custom-columns=\
"NAMESPACE:.metadata.namespace,NAME:.metadata.name,CPU:.spec.containers[*].resources.requests.cpu,MEM:.spec.containers[*].resources.requests.memory" \
| sort -k3 -h

Scale the cluster (add more nodes):

# EKS
eksctl scale nodegroup --cluster my-cluster --nodes 5 --name my-nodegroup

# GKE
gcloud container clusters resize my-cluster --num-nodes 5

# AKS
az aks scale --resource-group myRG --name myCluster --node-count 5

Pro Tip: Set resource requests to what the Pod actually needs under normal load, and limits to the maximum it should ever use. Overly generous requests waste cluster capacity. Use kubectl top pods to measure actual usage before setting requests.
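The fit check the scheduler performs is simple millicore arithmetic (1 CPU = 1000m). A quick back-of-the-envelope version, using hypothetical numbers matching the kind of output kubectl describe nodes shows under "Allocated resources":

```shell
# Hypothetical numbers: node allocatable 4000m CPU, 3800m already requested
allocatable=4000   # millicores
requested=3800
pod_request=500

free=$((allocatable - requested))
if (( pod_request > free )); then
  echo "Pod does not fit: needs ${pod_request}m, only ${free}m free"
fi
```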

Fix 2: Fix Node Selector and Affinity Issues

The Pod requires specific node labels that no node has:

# Pod requires nodes labeled with gpu=true
spec:
  nodeSelector:
    gpu: "true"

Check available node labels:

kubectl get nodes --show-labels
# or
kubectl get nodes -L gpu

Fix: Add the label to a node:

kubectl label nodes my-node gpu=true

Fix: Remove or update the node selector:

spec:
  # Remove nodeSelector if not needed
  # Or use nodeAffinity for soft preferences
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 1
          preference:
            matchExpressions:
              - key: gpu
                operator: In
                values: ["true"]

preferredDuringScheduling is a soft constraint — the scheduler prefers matching nodes but will use others if none match. requiredDuringScheduling is hard — the Pod stays Pending until a matching node exists.
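For contrast, the hard form of the same rule looks like this (a sketch, using the same hypothetical gpu label as above — note the extra nodeSelectorTerms level required by the API):

```yaml
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: gpu
              operator: In
              values: ["true"]
```

Prefer the soft form unless the workload genuinely cannot run elsewhere.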

Fix 3: Fix Taints and Tolerations

Nodes might have taints that prevent scheduling:

# Check node taints
kubectl describe node my-node | grep Taints
# Taints: node.kubernetes.io/not-ready:NoSchedule

Common taints:

Taint                                    Meaning
node.kubernetes.io/not-ready             Node is not healthy
node.kubernetes.io/unreachable           Node is unreachable
node.kubernetes.io/disk-pressure         Node disk is full
node.kubernetes.io/memory-pressure       Node is low on memory
node-role.kubernetes.io/control-plane    Control plane node

Add tolerations to the Pod:

spec:
  tolerations:
    - key: "gpu"
      operator: "Equal"
      value: "true"
      effect: "NoSchedule"

Remove a taint from a node:

kubectl taint nodes my-node gpu=true:NoSchedule-
# The trailing - removes the taint

For control plane nodes (single-node clusters):

kubectl taint nodes --all node-role.kubernetes.io/control-plane-

Common Mistake: Running a single-node cluster (like Minikube or kind) and wondering why Pods are Pending. Control plane nodes are tainted by default. Either add tolerations or remove the taint.

Fix 4: Fix PersistentVolumeClaim Issues

If the Pod uses a PVC that is not bound:

Warning  FailedScheduling  pod has unbound immediate PersistentVolumeClaims

Check PVC status:

kubectl get pvc
# NAME        STATUS    VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS
# my-data     Pending                                       standard

Common causes:

# No PersistentVolume matches the PVC
kubectl get pv

# StorageClass doesn't exist
kubectl get storageclass

# The storage provisioner is not running
kubectl get pods -n kube-system | grep provisioner

Fix: Create a PersistentVolume:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: my-pv
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: /data/my-pv

Fix: Use a valid StorageClass:

# List available storage classes
kubectl get storageclass

# Update PVC to use an existing storage class
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-data
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: standard  # Must match an existing StorageClass
  resources:
    requests:
      storage: 10Gi

For cloud providers, check the volume zone:

# The PVC might request a zone where no nodes exist
kubectl describe pvc my-data

Volumes are often zone-specific. If your node is in us-east-1a but the volume is in us-east-1b, the Pod cannot be scheduled.
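A common mitigation is a StorageClass with volumeBindingMode: WaitForFirstConsumer, which delays volume provisioning until the Pod is scheduled, so the volume is created in the Pod's zone. A sketch (the provisioner shown is the AWS EBS CSI driver as an example; substitute your cluster's provisioner):

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: standard-wait          # hypothetical name
provisioner: ebs.csi.aws.com   # example; check `kubectl get storageclass`
volumeBindingMode: WaitForFirstConsumer
```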

Fix 5: Fix ResourceQuota Limits

The namespace might have resource quotas that prevent scheduling:

kubectl describe resourcequota -n my-namespace
# Name:       my-quota
# Resource    Used    Hard
# --------    ----    ----
# cpu         3800m   4000m
# memory      7Gi     8Gi
# pods        19      20

Fix: Increase the quota:

apiVersion: v1
kind: ResourceQuota
metadata:
  name: my-quota
  namespace: my-namespace
spec:
  hard:
    cpu: "8"
    memory: "16Gi"
    pods: "40"

Fix: Reduce resource requests on existing Pods:

# Find Pods using the most resources
kubectl top pods -n my-namespace --sort-by=cpu

Fix: Clean up unused Pods:

# Delete completed jobs
kubectl delete jobs --field-selector status.successful=1 -n my-namespace

# Delete evicted pods
kubectl delete pods --field-selector status.phase=Failed -n my-namespace

Fix 6: Fix LimitRange Issues

A LimitRange might set minimum resource requirements higher than your Pod specifies:

kubectl describe limitrange -n my-namespace
# Type        Resource  Min     Max    Default Request  Default Limit
# ----        --------  ---     ---    ---------------  -------------
# Container   cpu       100m    2      200m             500m
# Container   memory    256Mi   4Gi    256Mi            512Mi

If the LimitRange requires a minimum of 256Mi memory but your Pod requests only 128Mi, the Pod is rejected at admission.

Fix: Increase your Pod’s resource requests to meet the minimum:

resources:
  requests:
    cpu: "100m"
    memory: "256Mi"  # Must meet the LimitRange minimum
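Alternatively, if the minimum itself is too strict for your workloads, lower it in the LimitRange. A sketch (the object name is hypothetical; check kubectl get limitrange for yours):

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: my-limits          # hypothetical; use your LimitRange's name
  namespace: my-namespace
spec:
  limits:
    - type: Container
      min:
        cpu: 100m
        memory: 128Mi      # lowered so smaller Pods are admitted
```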

Fix 7: Fix Pod Topology Spread Constraints

Spread constraints can prevent scheduling if there are not enough nodes:

spec:
  topologySpreadConstraints:
    - maxSkew: 1
      topologyKey: kubernetes.io/hostname
      whenUnsatisfiable: DoNotSchedule  # This makes it a hard constraint
      labelSelector:
        matchLabels:
          app: my-app

With maxSkew: 1 and DoNotSchedule, scheduling fails when every allowed placement would push the skew past 1. For example, with Pods spread 2/2/1 across three nodes and the third node cordoned or full, placing a new Pod on either remaining node would raise the skew to 2, so the Pod stays Pending.
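The skew arithmetic is easy to check by hand. As an illustration, suppose Pods are spread 2/2/1 across three nodes and the third node cannot accept the Pod; the counts below are what placing the new Pod on node 1 would produce:

```shell
# Skew = (most Pods in any matching domain) - (fewest in any domain).
# Hypothetical counts after placing the new Pod on node 1:
counts=(3 2 1)
max=0
min=999
for c in "${counts[@]}"; do
  if (( c > max )); then max=$c; fi
  if (( c < min )); then min=$c; fi
done
echo "skew=$((max - min))"   # skew=2, which violates maxSkew: 1
```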

Fix: Use ScheduleAnyway:

whenUnsatisfiable: ScheduleAnyway  # Soft constraint — best effort

Fix: Add more nodes or reduce replicas.

Fix 8: Debug Scheduling with Events and Logs

Check Pod events:

kubectl describe pod <pod-name> | tail -20

Check scheduler logs:

kubectl logs -n kube-system -l component=kube-scheduler --tail=50

Inspect node capacity:

# List each node's allocatable resources to see where the Pod could fit
kubectl get nodes -o json | jq '.items[] | {name: .metadata.name, allocatable: .status.allocatable}'

Force-reschedule all Pending Pods:

# Delete and let the Deployment recreate them
kubectl delete pod <pod-name>

Still Not Working?

Check for node cordons. A node might be cordoned (marked as unschedulable):

kubectl get nodes
# NAME      STATUS                     ROLES    AGE
# node-1    Ready,SchedulingDisabled   <none>   30d

# Uncordon the node
kubectl uncordon node-1

Check for PodDisruptionBudgets (PDBs). A PDB might prevent rescheduling during rolling updates.

Check for init containers. If an init container cannot complete (e.g., waiting for a service), the Pod stays in a Pending-like state (Init:0/1).

For Pods that start but immediately crash, see Fix: Kubernetes Pod CrashLoopBackOff. For image pull failures, see Fix: Kubernetes ImagePullBackOff. For OOM kills, see Fix: Kubernetes Pod OOMKilled.


FixDevs

Solo developer based in Japan. Every solution is cross-referenced with official documentation and tested before publishing.
