How to Fix Kubernetes Exit Code 137 (OOMKilled Pod Termination Guide)

Introduction

If you’ve managed Kubernetes workloads, you’ve probably seen a pod crash with Exit Code 137.

This error usually signals that the pod was OOMKilled, terminated because it exceeded its memory limit. While it looks alarming, the fix is often straightforward once you understand the cause.

In this guide, we’ll explain what Exit Code 137 means, its common causes, and 4 proven fixes to keep your pods healthy and prevent unexpected crashes.

What Does Exit Code 137 Mean?

Exit Code 137 in Kubernetes means the container's process was killed with SIGKILL. Exit codes above 128 encode 128 + the signal number, and SIGKILL is signal 9, so 128 + 9 = 137.

In practice, it almost always means:

  • The pod consumed more memory than allowed.

  • The kernel's OOM killer terminated it (or the kubelet evicted it under node memory pressure) to protect the node.

👉 In short: Your pod ran out of memory (OOMKilled).
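
To see where the number comes from, here's a quick demonstration you can run outside Kubernetes (plain bash; the sleep process is just a stand-in):

# Shells report 128 + <signal number> when a process is killed by a signal.
# SIGKILL is signal 9, so 128 + 9 = 137.
sleep 300 &
kill -9 $!
wait $!
echo $?   # prints 137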

Common Causes of Exit Code 137

  • Low memory limits – Pod spec sets the memory cap too low.

  • Application memory leaks – App keeps consuming memory until killed.

  • Node resource pressure – Node doesn’t have enough memory for all pods.

  • Large workloads – Queries, caches, or jobs exceeding pod limits.

How to Fix Exit Code 137

1. Increase Pod Memory Limits

Adjust your deployment YAML to allocate more memory.

resources:
  requests:
    memory: "256Mi"
  limits:
    memory: "512Mi"
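
For context, here's a minimal sketch of where that block sits inside a Deployment manifest (my-app and its image are placeholder names):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 2
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-app
          image: my-app:1.0        # placeholder image
          resources:
            requests:
              memory: "256Mi"      # the scheduler reserves this much per pod
            limits:
              memory: "512Mi"      # exceeding this triggers the OOM kill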

✅ Best when your app consistently needs more memory.
⚠️ Risk: Node may run out of memory if all pods demand more.

2. Optimize Application Code

Sometimes the pod doesn’t need more memory — it needs better code.

  • Fix memory leaks in your app.

  • Use tools like pprof, heap dumps, or Prometheus to profile memory usage (see the example below).

  • Review large in-memory caches or unclosed connections.

✅ Best when OOMKilled is due to app inefficiency.
⚠️ Requires developer involvement; slower fix.
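
For example, if the application is written in Go and exposes net/http/pprof on port 6060 (an assumption; adjust the port and endpoint to your app), you can pull a heap profile straight from the running pod:

# Forward the pod's pprof port to your machine (keep this running in one terminal).
kubectl port-forward pod/<pod-name> 6060:6060

# In a second terminal, fetch and explore the heap profile interactively.
go tool pprof http://localhost:6060/debug/pprof/heap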

3. Use Vertical Pod Autoscaler (VPA)

Automatically adjusts pod resource requests and limits.

kubectl apply -f vpa.yaml
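
The vpa.yaml referenced above might look something like this minimal sketch (it assumes the VPA components are installed in your cluster; my-app is a placeholder Deployment name):

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app            # the workload to right-size
  updatePolicy:
    updateMode: "Auto"      # VPA evicts pods and recreates them with updated requests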

✅ Best for unpredictable workloads.
⚠️ Extra resource overhead; may not be ideal for stable apps.

4. Monitor with Metrics & Alerts

Don’t wait for crashes. Use monitoring tools:

  • kubectl top pods → check real-time memory usage.

  • Prometheus + Grafana → visualize trends.

  • Set alerts when pods approach memory limits.

✅ Prevents repeat crashes.
⚠️ Needs observability stack set up.
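
As one example of such an alert, here's a sketch of a Prometheus alerting rule that fires when a container's working-set memory stays above 90% of its limit (it assumes cAdvisor and kube-state-metrics metrics are being scraped; the group name and threshold are placeholders):

groups:
  - name: pod-memory
    rules:
      - alert: PodNearMemoryLimit
        expr: |
          max by (namespace, pod, container) (container_memory_working_set_bytes{container!=""})
            /
          max by (namespace, pod, container) (kube_pod_container_resource_limits{resource="memory"})
            > 0.9
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "{{ $labels.namespace }}/{{ $labels.pod }} is above 90% of its memory limit"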

Troubleshooting Checklist

Run these quick checks before applying fixes:

  • Logs:

kubectl logs <pod-name>

  • Pod events:

kubectl describe pod <pod-name>

  • Node usage:

kubectl top nodes

  • Confirm OOMKilled status:

kubectl get pod <pod-name> -o yaml | grep -i oom
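
If you'd rather check the termination reason directly, the same information is available via jsonpath (the pod name is a placeholder):

kubectl get pod <pod-name> -o jsonpath='{.status.containerStatuses[*].lastState.terminated.reason}'
# A container that was OOM killed reports the reason "OOMKilled" here, alongside exit code 137.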

Quick Fix Reference Table

| Fix | When to Use | Risk |
| --- | --- | --- |
| Increase memory limits | Stable app, just under-provisioned | Node exhaustion |
| Optimize code | Memory leaks suspected | Slower to implement |
| VPA | Dynamic workloads | Overhead |
| Monitoring | Continuous issues | Setup effort |

How NudgeBee Helps

Exit Code 137 errors often slow down SRE teams. NudgeBee’s AI-powered SRE Assistant helps you:

  • Detect pods nearing OOMKilled before they crash.

  • Recommend fixes automatically.

  • Reduce MTTR with guided workflows.

👉 Discover how NudgeBee simplifies Kubernetes troubleshooting.

FAQs

What is Exit Code 137 in Kubernetes?
It means your pod was OOMKilled, terminated due to exceeding memory limits.

Is Exit Code 137 always memory-related?
Almost always in Kubernetes. The code itself only means the process received SIGKILL (128 + 9 = 137). The usual sender is the OOM killer, but a manual kill -9 or a forced termination produces the same exit code.

How do I prevent Exit Code 137?
1. Set realistic memory requests/limits.
2. Monitor memory usage with Prometheus.
3. Use VPA for dynamic workloads.

Can restarting a pod fix Exit Code 137?
Sometimes. A restart clears memory, but the underlying issue will return if not fixed. See: 4 ways to restart a Kubernetes pod.