Monitor Cluster in Kubernetes

Monitor for what?

  • Node-level metrics and performance metrics
    1. The number of nodes in the cluster.
    2. How many of them are healthy.
    3. CPU, memory, network, and disk utilization.
  • Pod-level metrics and performance metrics
    1. The number of pods.
    2. CPU and memory consumption of each pod.

How do you monitor resource consumption on Kubernetes?

We need a solution that monitors these metrics, stores them, and provides analytics on the data. Kubernetes does not come with a full-featured built-in monitoring solution. However, a number of open-source solutions are available, such as Prometheus, Metrics Server, and the Elastic Stack.

You can have one Metrics Server per Kubernetes cluster. It retrieves metrics from each of the Kubernetes nodes and pods, aggregates them, and stores them in memory. Because Metrics Server is an in-memory monitoring solution and does not store metrics on disk, you cannot see historical performance data.

How are the metrics generated for the pods on these nodes?

Kubernetes runs an agent on each node known as the kubelet, which is responsible for receiving instructions from the Kubernetes API server on the master node and running pods on the node. The kubelet also contains a subcomponent known as cAdvisor. cAdvisor retrieves performance metrics from pods and exposes them through the kubelet API, which makes them available to Metrics Server.
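
To look at both ends of this pipeline, the raw endpoints can be queried through the API server. The following is a minimal sketch, assuming kubectl is configured for the cluster and Metrics Server is already installed. The helper function names are hypothetical, and kubenode01 is simply the example node name used in this post. Depending on its version, Metrics Server may itself read a different kubelet endpoint than the cAdvisor one shown here.

```shell
# Hypothetical helpers; they only wrap `kubectl get --raw`.

# Raw cAdvisor metrics exposed by the kubelet on one node
# (Prometheus text format), proxied through the API server.
cadvisor_metrics() {
  node="$1"
  kubectl get --raw "/api/v1/nodes/${node}/proxy/metrics/cadvisor"
}

# Aggregated node metrics as served by Metrics Server
# through the Metrics API (JSON).
metrics_api_nodes() {
  kubectl get --raw "/apis/metrics.k8s.io/v1beta1/nodes"
}

# Usage (requires a running cluster):
#   cadvisor_metrics kubenode01 | head
#   metrics_api_nodes
```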

Deploy metrics server

https://github.com/kubernetes-sigs/metrics-server

kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.3.7/components.yaml

The command deploys a set of pods, services, and roles that enable the metrics server to poll the nodes in the cluster for performance metrics. Once deployed, give the metrics server some time to collect and process data. Once the data is processed, cluster performance can be viewed with the kubectl top command.
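
The wait can be scripted rather than guessed. This is a minimal sketch, assuming kubectl is already configured for the cluster. wait_for_metrics is a hypothetical helper name, and the default attempt count and delay are arbitrary choices.

```shell
# Hypothetical helper: poll `kubectl top node` until metrics are available.
# Arguments: max attempts (default 30), delay in seconds between attempts (default 10).
wait_for_metrics() {
  attempts="${1:-30}"
  delay="${2:-10}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    # `kubectl top node` exits non-zero while metrics are not yet served.
    if kubectl top node >/dev/null 2>&1; then
      echo "metrics available after ${i} attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "metrics still not available after ${attempts} attempt(s)" >&2
  return 1
}

# Usage (requires a running cluster):
#   kubectl -n kube-system rollout status deployment/metrics-server
#   wait_for_metrics 30 10
```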

$ kubectl top node
Error: metrics not available yet. 
  • If you get this error message, edit the metrics-server deployment with the command below:
kubectl -n kube-system edit deployment metrics-server

Add the following two arguments and save the changes:

    spec:
      containers:
      - args:
        - --kubelet-insecure-tls
        - --kubelet-preferred-address-types=InternalIP
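
As an alternative to the interactive editor, the same arguments can be appended with a JSON patch. This is a sketch rather than the official procedure: add_metrics_server_arg is a hypothetical helper, and it assumes that metrics-server is the first container in the deployment and that the container already has an args list.

```shell
# Hypothetical helper: append one argument to the first container of the
# metrics-server Deployment using a JSON patch (no interactive editor needed).
# Assumption: the container already defines an `args` list.
add_metrics_server_arg() {
  arg="$1"
  kubectl -n kube-system patch deployment metrics-server --type=json \
    -p "[{\"op\":\"add\",\"path\":\"/spec/template/spec/containers/0/args/-\",\"value\":\"${arg}\"}]"
}

# Usage (requires a running cluster):
#   add_metrics_server_arg --kubelet-insecure-tls
#   add_metrics_server_arg --kubelet-preferred-address-types=InternalIP
```

Each call triggers a new rollout of the deployment. Note that --kubelet-insecure-tls skips verification of the kubelet's serving certificate, so it is best kept to test clusters.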

After the changes are applied to the cluster, wait 2-3 minutes for the metrics to be fetched.

$ kubectl top node
NAME         CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%     
kubemaster   1139m        56%    1037Mi          54%         
kubenode01   41m          2%     817Mi           43%         
$ kubectl top pod -n kube-system
NAME                                 CPU(cores)   MEMORY(bytes)   
coredns-f9fd979d6-wtk5k              4m           9Mi             
coredns-f9fd979d6-x5zxv              4m           15Mi            
etcd-kubemaster                      18m          71Mi            
fluentd-elasticsearch-blvmc          2m           39Mi            
fluentd-elasticsearch-cgn9r          1m           39Mi            
kube-apiserver-kubemaster            57m          295Mi           
kube-controller-manager-kubemaster   20m          67Mi            
kube-proxy-jnf5q                     1m           15Mi            
kube-proxy-zfbsh                     1m           20Mi            
kube-scheduler-kubemaster            4m           29Mi            
metrics-server-8bbfb4bdb-b2m2g       2m           11Mi            
weave-net-g4l7r                      1m           57Mi            
weave-net-xg67h                      2m           43Mi     
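
The tabular output is easy to post-process. The sketch below filters the node table for nodes at or above a CPU threshold. busy_nodes is a hypothetical helper, and the 50% default is an arbitrary choice. It assumes the column layout shown above, with CPU% in the third column.

```shell
# Hypothetical helper: read `kubectl top node` output on stdin and print the
# names of nodes whose CPU% is at or above the given threshold (default 50).
busy_nodes() {
  threshold="${1:-50}"
  # Skip the header row, strip the trailing "%" from column 3, compare numerically.
  awk -v t="$threshold" 'NR > 1 { pct = $3; sub(/%/, "", pct); if (pct + 0 >= t) print $1 }'
}

# Usage (requires a running cluster):
#   kubectl top node | busy_nodes 50
# Reasonably recent kubectl versions can also sort pods directly:
#   kubectl top pod -n kube-system --sort-by=cpu
```

With the node table shown above as input, a threshold of 50 would select only kubemaster, since kubenode01 sits at 2%.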