Kubernetes Node Not Ready? Here’s How to Fix It Fast

Your Kubernetes node suddenly became NotReady?

You run: kubectl get nodes

and see: worker-node-1 NotReady

At this point:

  • new pods are no longer scheduled onto the node
  • workloads running on it can fail or be evicted
  • cluster health starts degrading

In most cases, the issue is related to:

  • kubelet failure
  • resource exhaustion
  • networking issues
  • container runtime problems

This guide walks through the fastest way to diagnose and fix it.
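Before digging in, it can help to list exactly which nodes are affected. The sketch below is one way to do that; the helper name not_ready_nodes is hypothetical, and it assumes the default column layout of kubectl get nodes (name first, status second).

```shell
# Hypothetical helper: print the names of nodes whose status is not "Ready".
# Reads `kubectl get nodes --no-headers` output on stdin.
# A status such as "Ready,SchedulingDisabled" still counts as Ready here,
# because only the part before the first comma is compared.
not_ready_nodes() {
  awk '{ split($2, s, ","); if (s[1] != "Ready") print $1 }'
}

# Usage (requires cluster access):
#   kubectl get nodes --no-headers | not_ready_nodes
```

The function only filters text, so it can be tried against saved output from an earlier incident as well.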

First: Check Why the Node Became NotReady

Run: kubectl describe node <node-name>

Now look under: Conditions

This section usually tells you the real issue.
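If the describe output is too long to scan, the same conditions can be pulled out with a jsonpath query and filtered. This is a minimal sketch: the helper name unhealthy_conditions is an assumption, and it treats Ready as healthy only when it is True and every other condition as healthy only when it is False.

```shell
# Hypothetical helper: print conditions that look unhealthy.
# Input: tab-separated "type<TAB>status<TAB>reason" lines on stdin.
# "Unknown" statuses are flagged too, since they are neither True nor False.
unhealthy_conditions() {
  awk -F'\t' '($1 == "Ready" && $2 != "True") || ($1 != "Ready" && $2 != "False") {
    print $1 ": " $2 " (" $3 ")"
  }'
}

# Usage (requires cluster access; replace <node-name>):
#   kubectl get node <node-name> \
#     -o jsonpath='{range .status.conditions[*]}{.type}{"\t"}{.status}{"\t"}{.reason}{"\n"}{end}' \
#     | unhealthy_conditions
```

An empty result suggests the conditions themselves look fine and the cause lies elsewhere.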

Scenario 1: Kubelet Stopped Running

This is one of the most common causes.

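Check kubelet: systemctl status kubelet

If it is inactive or failed, restart it: sudo systemctl restart kubelet

If it keeps failing, read its logs: journalctl -u kubelet

Wait 30–60 seconds and check: kubectl get nodes

If the node becomes Ready again, the immediate issue is resolved.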
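The kubelet check can be sketched as a small function to run on the affected node. The name unit_state is hypothetical, and the sketch assumes the node uses systemd with a unit called kubelet.

```shell
# Hypothetical helper: report the state of a systemd unit, e.g. "kubelet: active".
unit_state() {
  unit="$1"
  # `systemctl is-active` prints the state and exits non-zero unless it is "active",
  # so the exit status is deliberately ignored here.
  state="$(systemctl is-active "$unit" 2>/dev/null)" || true
  echo "${unit}: ${state:-unknown}"
}

# Usage on the affected node:
#   unit_state kubelet
#   sudo systemctl restart kubelet      # only if the state is not "active"
#   sudo journalctl -u kubelet --since "15 minutes ago" --no-pager | tail -n 50
```

Reading the recent journal entries before restarting preserves the evidence of why kubelet stopped.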

Scenario 2: Disk Space Is Full

A node with almost no storage frequently becomes NotReady.

Check storage: df -h

Look for:

  • /var
  • /
  • container storage paths

If usage is near 100%:

  • remove old logs
  • clean unused images
  • prune containers

Example (Docker): docker system prune -a

Note that -a removes all unused images, not just dangling ones.

For containerd environments, remove unused images with: crictl rmi --prune
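To spot the full filesystem quickly, the df output can be filtered by usage. This is a rough sketch: the helper name full_mounts and the 85% default are assumptions chosen for illustration, so adjust the threshold to match your kubelet eviction settings.

```shell
# Hypothetical helper: print mount points at or above a usage threshold (default 85%).
# Reads `df -P` output on stdin; column 5 is the usage (e.g. "91%"),
# column 6 is the mount point. Mount points containing spaces are not handled.
full_mounts() {
  threshold="${1:-85}"
  awk -v t="$threshold" 'NR > 1 {
    use = $5
    sub(/%/, "", use)
    if (use + 0 >= t + 0) print $6 " " $5
  }'
}

# Usage on the affected node:
#   df -P | full_mounts 85
```

Using df -P keeps each filesystem on a single line, which makes the column positions predictable.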

Scenario 3: Container Runtime Failure

If Docker or containerd crashes:

  • Kubernetes cannot manage pods
  • node health fails

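Check runtime: systemctl status containerd (or systemctl status docker on Docker-based nodes)

If the service is inactive or failed, restart it, for example: sudo systemctl restart containerd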
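If you are not sure which runtime a node uses, the node object reports it. The sketch below maps that value to a systemd unit name; runtime_unit is a hypothetical helper, and it only covers the two runtimes mentioned in this guide.

```shell
# Hypothetical helper: map a containerRuntimeVersion string
# (e.g. "containerd://1.7.2") to the systemd unit to inspect.
runtime_unit() {
  case "$1" in
    containerd://*) echo "containerd" ;;
    docker://*)     echo "docker" ;;
    *)              echo "unknown" ;;
  esac
}

# Usage (replace <node-name>; the kubectl call needs cluster access):
#   rt="$(kubectl get node <node-name> -o jsonpath='{.status.nodeInfo.containerRuntimeVersion}')"
#   unit="$(runtime_unit "$rt")"
#   systemctl status "$unit"            # run on the node itself
#   sudo systemctl restart "$unit"      # only if it is inactive or failed
```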

Scenario 4: Kubernetes Networking Broke

A broken CNI plugin can isolate the node.

Check kube-system pods: kubectl get pods -n kube-system

Look for failures in:

  • Calico
  • Flannel
  • Cilium

Pay particular attention to: CrashLoopBackOff

A CNI pod stuck in this state on the affected node usually means the node's pod networking is not being set up.
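The kube-system check can be narrowed to pods that are not healthy. A minimal sketch, assuming the default column layout of kubectl get pods; the helper name unhealthy_pods is hypothetical.

```shell
# Hypothetical helper: print pods whose STATUS is neither Running nor Completed.
# Reads `kubectl get pods --no-headers` output on stdin
# (columns: NAME READY STATUS RESTARTS AGE).
# A pod that is Running but not fully ready (e.g. 0/1) is not flagged.
unhealthy_pods() {
  awk '$3 != "Running" && $3 != "Completed" { print $1 " " $3 " restarts=" $4 }'
}

# Usage (requires cluster access; replace the placeholders):
#   kubectl get pods -n kube-system --no-headers \
#     --field-selector spec.nodeName=<node-name> | unhealthy_pods
#   kubectl logs -n kube-system <pod-name> --previous
```

The --previous flag shows logs from the last crashed container, which is usually where the actual error is.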

Scenario 5: Node Ran Out of Memory

High memory pressure can push the node into an unhealthy state.

Check: free -m and top

If memory usage is extremely high:

  • identify memory-heavy pods
  • increase node size
  • optimize workloads
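For a single number instead of the full free or top output, the available memory can be computed from /proc/meminfo. This is a sketch for Linux nodes; the helper name mem_available_pct is hypothetical.

```shell
# Hypothetical helper: print the percentage of memory still available (integer).
# Reads /proc/meminfo-style input on stdin and uses MemAvailable / MemTotal.
mem_available_pct() {
  awk '/^MemTotal:/     { total = $2 }
       /^MemAvailable:/ { avail = $2 }
       END { if (total > 0) printf "%d\n", avail * 100 / total }'
}

# Usage on the affected node:
#   mem_available_pct < /proc/meminfo
# To find memory-heavy pods (assumes metrics-server is installed):
#   kubectl top pods -A --sort-by=memory | head
```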

Fastest Recovery Path (Most Engineers Do This)

If production is badly impacted, work through these steps:

  • Restart kubelet
  • Restart container runtime
  • Check disk space
  • Reboot node if required

In many real-world cases, this restores the node quickly.
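The recovery steps above can be sketched as a script that prints what it would do by default. The names run and recover_node and the DRY_RUN variable are hypothetical, and the reboot is left as a manual decision rather than automated.

```shell
# Run a command, or only print it when DRY_RUN is 1 (the default in this sketch).
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "would run: $*"
  else
    "$@"
  fi
}

# Hypothetical helper: apply the recovery steps in the order listed above.
# $1 is the container runtime unit (assumed to be containerd if omitted).
recover_node() {
  runtime="${1:-containerd}"
  run systemctl restart kubelet
  run systemctl restart "$runtime"
  run df -h
}

# Usage on the affected node (as root):
#   recover_node containerd               # dry run, prints the commands
#   DRY_RUN=0 recover_node containerd     # actually runs them
# If a reboot is still needed, consider draining first (from a machine with cluster access):
#   kubectl drain <node-name> --ignore-daemonsets
```

Defaulting to a dry run is a deliberate choice here, so the script cannot restart services on a production node by accident.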

The Mistake Most Teams Make

Most teams only restart the node.

But if the real issue is:

  • disk pressure
  • memory exhaustion
  • broken networking

…the problem comes back.

You need to identify the actual root cause.

How Teams Prevent “Node Not Ready” Incidents

Modern SRE teams use:

  • node health monitoring
  • automated alerts
  • resource pressure detection
  • AI-based root cause analysis

This helps detect failures before workloads go down.

Tools That Help Detect Node Failures Early

Nudgebee

Useful for:

  • Kubernetes health monitoring
  • automated diagnostics
  • MTTR reduction
  • infrastructure incident workflows

Prometheus + Grafana

Good for:

  • node metrics
  • resource monitoring

Datadog

Useful for:

  • Kubernetes infrastructure visibility

Quick Command Cheat Sheet

kubectl get nodes
kubectl describe node <node-name>
systemctl status kubelet
systemctl status containerd
df -h
kubectl get pods -n kube-system

FAQs

Why does a Kubernetes node become NotReady?

A node usually becomes NotReady because of kubelet failures, disk pressure, memory exhaustion, networking issues, or container runtime problems.

How do I check node health in Kubernetes?

Run:

kubectl describe node <node-name>

Check the Conditions section for:

  • Ready
  • MemoryPressure
  • DiskPressure
  • PIDPressure
  • NetworkUnavailable

How do I fix a Kubernetes Node Not Ready error?

Common fixes include:

  • restarting kubelet
  • checking disk space
  • restarting container runtime
  • verifying network plugins
  • rebooting the node if needed

Can disk pressure cause Kubernetes Node Not Ready?

Yes. If the node runs out of storage, kubelet and containers may stop functioning properly, causing the node to enter the NotReady state.

Can networking issues cause Node Not Ready?

Yes. Problems with CNI plugins like:

  • Calico
  • Flannel
  • Cilium

can prevent the node from communicating with the cluster.

What happens when a node becomes NotReady?

When a node becomes NotReady:

  • new pods may not schedule
  • existing workloads may fail
  • applications can become unavailable