- Check all the nodes are healthy.
- Check the failed node.
- Check CPU, memory, and disk space on the node.
- Check Kubelet Status.
- Check Certificates.
Check all the nodes are healthy.
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
kubemaster Ready control-plane,master 38h v1.20.2
kubenode01 NotReady <none> 37h v1.20.2
kubenode02 Ready <none> 37h v1.20.2
If a node is reported as NotReady, check its details using kubectl describe node.
Check the failed node
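On a larger cluster it can help to pull out only the nodes that need attention. The following is a minimal sketch, not part of the original walkthrough: `not_ready_nodes` is a hypothetical helper name, and it assumes the default column layout of `kubectl get nodes` shown above (name in column 1, status in column 2).

```shell
# Hypothetical helper: print the names of nodes whose status is not Ready.
# Reads the default `kubectl get nodes` table (with its header row) on stdin.
not_ready_nodes() {
  # The STATUS column can carry a suffix such as "Ready,SchedulingDisabled"
  # for a cordoned node, so only the part before the first comma is compared.
  awk 'NR > 1 { split($2, s, ","); if (s[1] != "Ready") print $1 }'
}

# Usage (assumes kubectl is already configured for the cluster):
#   kubectl get nodes | not_ready_nodes
```

Each name it prints can then be passed to kubectl describe node.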
$ kubectl describe node kubenode01
Name: kubenode01
Roles: <none>
Labels: beta.kubernetes.io/arch=amd64
beta.kubernetes.io/os=linux
kubernetes.io/arch=amd64
kubernetes.io/hostname=kubenode01
kubernetes.io/os=linux
Annotations: kubeadm.alpha.kubernetes.io/cri-socket: /var/run/dockershim.sock
node.alpha.kubernetes.io/ttl: 0
volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp: Fri, 29 Jan 2021 01:22:58 +0000
Taints: node.kubernetes.io/unreachable:NoExecute
node.kubernetes.io/unreachable:NoSchedule
Unschedulable: false
Lease:
HolderIdentity: kubenode01
AcquireTime: <unset>
RenewTime: Sat, 30 Jan 2021 15:19:06 +0000
Conditions:
Type Status LastHeartbeatTime LastTransitionTime Reason Message
---- ------ ----------------- ------------------ ------ -------
NetworkUnavailable False Fri, 29 Jan 2021 08:08:14 +0000 Fri, 29 Jan 2021 08:08:14 +0000 WeaveIsUp Weave pod has set this
MemoryPressure Unknown Sat, 30 Jan 2021 15:17:52 +0000 Sat, 30 Jan 2021 15:19:49 +0000 NodeStatusUnknown Kubelet stopped posting node status.
DiskPressure Unknown Sat, 30 Jan 2021 15:17:52 +0000 Sat, 30 Jan 2021 15:19:49 +0000 NodeStatusUnknown Kubelet stopped posting node status.
PIDPressure Unknown Sat, 30 Jan 2021 15:17:52 +0000 Sat, 30 Jan 2021 15:19:49 +0000 NodeStatusUnknown Kubelet stopped posting node status.
Ready Unknown Sat, 30 Jan 2021 15:17:52 +0000 Sat, 30 Jan 2021 15:19:49 +0000 NodeStatusUnknown Kubelet stopped posting node status.
Addresses:
InternalIP: 192.168.56.3
Hostname: kubenode01
Capacity:
cpu: 2
ephemeral-storage: 40593612Ki
hugepages-2Mi: 0
memory: 2040788Ki
pods: 110
Allocatable:
cpu: 2
ephemeral-storage: 37411072758
hugepages-2Mi: 0
memory: 1938388Ki
pods: 110
System Info:
Machine ID: 63b75d07d8cc40709d065a83e1965f1a
System UUID: E97DD35B-7625-6944-B998-56973029AD53
Boot ID: b72ebe6d-9393-47ba-a7e9-8a22397e346d
Kernel Version: 4.15.0-135-generic
OS Image: Ubuntu 18.04.5 LTS
Operating System: linux
Architecture: amd64
Container Runtime Version: docker://19.3.11
Kubelet Version: v1.20.2
Kube-Proxy Version: v1.20.2
PodCIDR: 10.244.1.0/24
PodCIDRs: 10.244.1.0/24
Non-terminated Pods: (3 in total)
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits AGE
--------- ---- ------------ ---------- --------------- ------------- ---
default nginx-pod 0 (0%) 0 (0%) 0 (0%) 0 (0%) 31h
kube-system kube-proxy-xvclq 0 (0%) 0 (0%) 0 (0%) 0 (0%) 37h
kube-system weave-net-7cgwj 100m (5%) 0 (0%) 200Mi (10%) 0 (0%) 37h
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
Resource Requests Limits
-------- -------- ------
cpu 100m (5%) 0 (0%)
memory 200Mi (10%) 0 (0%)
ephemeral-storage 0 (0%) 0 (0%)
hugepages-2Mi 0 (0%) 0 (0%)
Events: <none>
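Each node has a set of conditions that can point to why it might have failed. Depending on the node's state, each condition's status is set to True, False, or Unknown. In the output above, every condition except NetworkUnavailable is Unknown with the message "Kubelet stopped posting node status.", which means the control plane has stopped receiving status updates from the kubelet on kubenode01.
Check CPU, memory, and disk space on the node.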
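The conditions can also be read without scanning the full describe output. The sketch below is illustrative only: `unhealthy_conditions` is a hypothetical helper, and it assumes the healthy values are True for Ready and False for every other condition type.

```shell
# Hypothetical helper: read "TYPE<TAB>STATUS" lines on stdin and print the
# conditions that deviate from the assumed healthy value (Ready=True,
# everything else=False).
unhealthy_conditions() {
  awk -F '\t' '
    NF < 2 { next }                                   # skip blank lines
    $1 == "Ready" && $2 != "True"  { print $1 "=" $2; next }
    $1 != "Ready" && $2 != "False" { print $1 "=" $2 }
  '
}

# Usage (assumes kubectl access; the node name is taken from the example above):
#   kubectl get node kubenode01 \
#     -o jsonpath='{range .status.conditions[*]}{.type}{"\t"}{.status}{"\n"}{end}' \
#     | unhealthy_conditions
```

An Unknown status is treated the same as an unhealthy one here, since it means no fresh status has been reported for that condition.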
$ top
top - 15:25:41 up 9:49, 1 user, load average: 0.68, 0.53, 0.44
Tasks: 134 total, 3 running, 88 sleeping, 0 stopped, 0 zombie
%Cpu(s): 1.5 us, 1.0 sy, 0.0 ni, 97.0 id, 0.2 wa, 0.0 hi, 0.3 si, 0.0 st
KiB Mem : 2040788 total, 108180 free, 780784 used, 1151824 buff/cache
KiB Swap: 0 total, 0 free, 0 used. 1215864 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
2994 root 20 0 1098956 361876 69096 S 2.7 17.7 23:05.74 kube-apiserver
843 root 20 0 1960740 104676 63732 S 1.7 5.1 13:06.56 kubelet
980 root 20 0 1566360 93160 45852 S 0.7 4.6 4:26.80 dockerd
2913 root 20 0 10.121g 67016 22440 S 0.7 3.3 5:22.83 etcd
3128 root 20 0 816780 105336 60328 R 0.7 5.2 6:03.55 kube-controller
3068 root 20 0 747620 48912 32364 S 0.3 2.4 1:24.29 kube-scheduler
4073 root 20 0 107700 5156 4324 S 0.3 0.3 0:00.74 containerd-shim
4129 root 20 0 743820 36220 26272 S 0.3 1.8 0:04.31 kube-proxy
5943 root 20 0 747404 37640 29316 S 0.3 1.8 1:07.45 coredns
$ df -h
Filesystem Size Used Avail Use% Mounted on
udev 984M 0 984M 0% /dev
tmpfs 200M 1.6M 198M 1% /run
/dev/sda1 39G 3.2G 36G 9% /
tmpfs 997M 0 997M 0% /dev/shm
tmpfs 5.0M 0 5.0M 0% /run/lock
tmpfs 997M 0 997M 0% /sys/fs/cgroup
vagrant 234G 102G 132G 44% /vagrant
tmpfs 200M 0 200M 0% /run/user/1000
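Reading the df table by eye works for one node; for repeated checks, a small filter can flag only the filesystems that are close to full. This is a sketch under stated assumptions: `full_filesystems` is a hypothetical helper, the 85 percent default is an arbitrary illustrative threshold rather than a Kubernetes setting, and mount points are assumed to contain no spaces.

```shell
# Hypothetical helper: print "MOUNTPOINT USE%" for every filesystem whose
# usage is at or above a threshold (default 85, an illustrative value).
# Reads `df -P` output on stdin; -P keeps each filesystem on a single line.
full_filesystems() {
  threshold="${1:-85}"
  awk -v limit="$threshold" '
    NR > 1 {
      use = $5
      sub(/%/, "", use)              # "92%" -> "92"
      if (use + 0 >= limit + 0) print $6 " " $5
    }
  '
}

# Usage:
#   df -P | full_filesystems 85
```

No output means every filesystem is below the threshold, as is the case for the df output above.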
Check Kubelet Status
$ service kubelet status
sh: 0: getcwd() failed: No such file or directory
● kubelet.service - kubelet: The Kubernetes Node Agent
Loaded: loaded (/lib/systemd/system/kubelet.service; enabled; vendor preset: enabled)
Drop-In: /etc/systemd/system/kubelet.service.d
└─10-kubeadm.conf
Active: active (running) since Fri 2021-01-29 08:07:10 UTC; 1 day 7h ago
Docs: https://kubernetes.io/docs/home/
Main PID: 843 (kubelet)
Tasks: 19 (limit: 2360)
CGroup: /system.slice/kubelet.service
└─843 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kub
Warning: Journal has been rotated since unit was started. Log output is incomplete or unavailable.
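The status output above reports the service as active but carries no log excerpt, so the kubelet's journal is the next place to look. The sketch below is an illustration, not the original author's method: `kubelet_errors` is a hypothetical helper, and it assumes the kubelet writes klog-style text lines that begin with a severity letter (I, W, E, or F) followed by the month and day. A kubelet configured for JSON logging would need a different filter.

```shell
# Hypothetical helper: keep only error (E) and fatal (F) klog lines.
# Reads kubelet log messages on stdin, one per line.
# Like grep, it exits with status 1 when no line matches.
kubelet_errors() {
  grep -E '^[EF][0-9]{4} [0-9]{2}:[0-9]{2}:[0-9]{2}'
}

# Usage (run on the affected node; -o cat prints only the message text):
#   journalctl -u kubelet --no-pager -o cat --since "1 hour ago" | kubelet_errors
```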
Check Certificates
$ openssl x509 -in /var/lib/kubelet/pki/kubelet.crt -text
Certificate:
Data:
Version: 3 (0x2)
Serial Number: 2 (0x2)
Signature Algorithm: sha256WithRSAEncryption
Issuer: CN = kubenode02-ca@1611883386
Validity
Not Before: Jan 29 00:23:06 2021 GMT
Not After : Jan 29 00:23:06 2022 GMT
Subject: CN = kubenode02@1611883386
Subject Public Key Info:
Public Key Algorithm: rsaEncryption
RSA Public-Key: (2048 bit)
Modulus:
00:dc:0a:4c:41:e3:2f:f6:09:7e:61:77:ff:60:43:
34:6f:58:c1:7d:3e:51:d3:f5:d2:6e:f0:6f:42:ec:
2b:7e:70:07:18:3e:4c:2f:eb:ec:07:24:5f:27:f8:
3e:d9:12:5b:ca:05:ba:0d:6a:34:91:58:4a:05:e9:
bf:44:a7:7e:56:e0:d9:89:6e:ac:0f:ef:cc:5a:5e:
20:98:07:95:d2:87:82:03:7f:33:8f:df:7a:43:e6:
14:06:b2:25:d0:74:d8:f4:99:ab:26:0a:d3:1c:66:
f7:7a:40:61:17:5f:68:77:9f:ae:98:51:a1:cc:c9:
58:7c:0a:d9:1e:5b:2d:7a:eb:04:ac:ee:49:a8:ab:
03:e6:d1:f0:ea:92:01:7c:55:2e:a9:7f:bd:fa:59:
5c:17:65:4a:e7:fc:44:d8:35:cf:9d:a6:cd:cd:17:
f0:76:97:86:f4:dc:8b:68:c0:c8:d6:da:68:03:b0:
56:db:70:93:dd:97:60:82:29:be:2c:83:1f:55:2e:
a9:78:cc:94:64:32:bb:8e:f5:73:79:0b:99:96:d9:
e6:c6:61:ba:ed:87:80:14:57:51:db:f2:48:fb:1c:
97:0a:5e:67:44:22:24:92:f4:26:5b:f9:00:2b:ce:
08:3c:31:9e:cd:b0:95:d3:14:42:cb:6e:e4:69:b0:
6e:01
Exponent: 65537 (0x10001)
X509v3 extensions:
X509v3 Key Usage: critical
Digital Signature, Key Encipherment
X509v3 Extended Key Usage:
TLS Web Server Authentication
X509v3 Basic Constraints: critical
CA:FALSE
X509v3 Authority Key Identifier:
keyid:E0:C1:BC:49:11:7D:9E:4D:90:2E:89:E9:79:B7:9E:D9:1A:C3:7E:86
X509v3 Subject Alternative Name:
DNS:kubenode02
Signature Algorithm: sha256WithRSAEncryption
0c:7d:22:f5:0d:5b:25:02:ed:8a:34:44:0d:11:80:d9:7e:47:
1c:e7:d1:60:e2:bb:38:53:bd:23:75:ab:c1:72:72:9e:38:09:
45:f8:ce:d5:52:31:d6:51:44:96:5d:56:09:89:5b:8a:e8:ee:
30:4f:30:ba:6e:fe:06:0d:8c:2e:85:fa:c3:97:42:a0:6d:1d:
98:a4:9d:d6:6d:b8:e1:a5:56:b2:13:19:5d:85:0a:81:49:dd:
bf:ca:3d:fd:34:56:8e:00:0c:7f:30:31:d9:1d:46:76:af:6f:
2d:94:a3:6f:04:bb:3a:aa:5f:d3:7e:b4:b6:86:5a:0a:ea:d8:
9c:4d:e8:7e:97:10:e9:8b:9e:4d:fb:5b:32:26:fa:f0:05:ae:
a8:d7:34:e2:3e:f8:83:7e:df:e8:dc:c5:f7:f9:81:26:4a:ed:
3e:41:80:20:68:ce:76:16:6f:89:82:e2:42:44:c3:0e:43:dd:
02:8d:e5:11:94:3b:71:63:5b:72:a3:63:3f:b6:1f:d5:f0:d6:
b8:81:1d:32:cf:92:91:71:20:44:d3:70:1e:d3:c9:a7:60:72:
4a:9f:2d:be:64:77:f2:47:1c:d3:0e:ed:04:07:f6:37:b1:69:
d7:70:8f:2f:f2:ff:c7:92:11:9c:41:79:4d:fd:ec:43:17:3d:
00:e8:27:b7
-----BEGIN CERTIFICATE-----
MIIDKzCCAhOgAwIBAgIBAjANBgkqhkiG9w0BAQsFADAjMSEwHwYDVQQDDBhrdWJl
bm9kZTAyLWNhQDE2MTE4ODMzODYwHhcNMjEwMTI5MDAyMzA2WhcNMjIwMTI5MDAy
MzA2WjAgMR4wHAYDVQQDDBVrdWJlbm9kZTAyQDE2MTE4ODMzODYwggEiMA0GCSqG
SIb3DQEBAQUAA4IBDwAwggEKAoIBAQDcCkxB4y/2CX5hd/9gQzRvWMF9PlHT9dJu
8G9C7Ct+cAcYPkwv6+wHJF8n+D7ZElvKBboNajSRWEoF6b9Ep35W4NmJbqwP78xa
XiCYB5XSh4IDfzOP33pD5hQGsiXQdNj0masmCtMcZvd6QGEXX2h3n66YUaHMyVh8
CtkeWy166wSs7kmoqwPm0fDqkgF8VS6pf736WVwXZUrn/ETYNc+dps3NF/B2l4b0
3ItowMjW2mgDsFbbcJPdl2CCKb4sgx9VLql4zJRkMruO9XN5C5mW2ebGYbrth4AU
V1Hb8kj7HJcKXmdEIiSS9CZb+QArzgg8MZ7NsJXTFELLbuRpsG4BAgMBAAGjbTBr
MA4GA1UdDwEB/wQEAwIFoDATBgNVHSUEDDAKBggrBgEFBQcDATAMBgNVHRMBAf8E
AjAAMB8GA1UdIwQYMBaAFODBvEkRfZ5NkC6J6Xm3ntkaw36GMBUGA1UdEQQOMAyC
Cmt1YmVub2RlMDIwDQYJKoZIhvcNAQELBQADggEBAAx9IvUNWyUC7Yo0RA0RgNl+
Rxzn0WDiuzhTvSN1q8Fycp44CUX4ztVSMdZRRJZdVgmJW4ro7jBPMLpu/gYNjC6F
+sOXQqBtHZikndZtuOGlVrITGV2FCoFJ3b/KPf00Vo4ADH8wMdkdRnavby2Uo28E
uzqqX9N+tLaGWgrq2JxN6H6XEOmLnk37WzIm+vAFrqjXNOI++IN+3+jcxff5gSZK
7T5BgCBoznYWb4mC4kJEww5D3QKN5RGUO3FjW3KjYz+2H9Xw1riBHTLPkpFxIETT
cB7TyadgckqfLb5kd/JHHNMO7QQH9jexaddwjy/y/8eSEZxBeU397EMXPQDoJ7c=
-----END CERTIFICATE-----
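In the dump above, the fields that matter most for troubleshooting are the Validity dates, the Issuer, and the Subject. Rather than reading the whole dump, the expiry can be checked directly. The following is a minimal sketch: `cert_expires_within` is a hypothetical helper and the 30-day default is an arbitrary choice; it relies on the `-checkend` option of `openssl x509`.

```shell
# Hypothetical helper: succeed (exit 0) if the certificate at $1 expires
# within $2 days (default 30, an illustrative value) or has already expired.
# Note: an unreadable file also makes openssl fail, so it is reported the
# same way as an expiring certificate.
cert_expires_within() {
  cert="$1"
  days="${2:-30}"
  # -checkend N exits 0 when the certificate is still valid N seconds from now.
  ! openssl x509 -in "$cert" -noout -checkend "$((days * 86400))" >/dev/null
}

# Usage (path taken from the example above):
#   openssl x509 -in /var/lib/kubelet/pki/kubelet.crt -noout -subject -issuer -dates
#   if cert_expires_within /var/lib/kubelet/pki/kubelet.crt 30; then
#     echo "kubelet certificate expires within 30 days or has expired"
#   fi
```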