The default Kubernetes scheduler uses an algorithm that distributes pods evenly across nodes and also takes into account the conditions we specify through taints and tolerations, node affinity, and so on.
What if a specific application requires its components to be placed on nodes only after some additional checks? In that case you may decide to use your own scheduling algorithm, so that you can add your own custom conditions and checks. Kubernetes is highly extensible: you can write your own Kubernetes scheduler program, package it, and deploy it as the default scheduler or as an additional scheduler in the Kubernetes cluster.
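As a rough illustration of what "custom conditions and checks" could look like, here is a toy filter-then-score model. It is only a sketch and not the kube-scheduler's actual code: the `Node` and `Pod` classes, the `passes_custom_checks` and `pick_node` functions, and the "most free CPU wins" rule are all hypothetical names and choices made for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    name: str
    free_cpu_millicores: int
    labels: Dict[str, str] = field(default_factory=dict)


@dataclass
class Pod:
    name: str
    cpu_request_millicores: int
    required_labels: Dict[str, str] = field(default_factory=dict)


def passes_custom_checks(pod: Pod, node: Node) -> bool:
    """Filter step: the node needs enough free CPU and every required label."""
    if node.free_cpu_millicores < pod.cpu_request_millicores:
        return False
    return all(node.labels.get(k) == v for k, v in pod.required_labels.items())


def pick_node(pod: Pod, nodes: List[Node]) -> Optional[str]:
    """Score step: among feasible nodes, prefer the one with the most free CPU."""
    feasible = [n for n in nodes if passes_custom_checks(pod, n)]
    if not feasible:
        return None  # in a real cluster the pod would stay Pending
    return max(feasible, key=lambda n: n.free_cpu_millicores).name


# Hypothetical cluster state used only for this example.
nodes = [
    Node("node01", 500, {"disk": "ssd"}),
    Node("node02", 2000, {"disk": "hdd"}),
    Node("node03", 1500, {"disk": "ssd"}),
]
print(pick_node(Pod("db", 1000, {"disk": "ssd"}), nodes))  # prints: node03
```

Here node01 is rejected for lack of CPU and node02 for the wrong label, so node03 is chosen. A real custom scheduler would apply the same kind of logic to live cluster state and then bind the pod to the selected node.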
A Kubernetes cluster can run multiple schedulers at the same time. When creating a pod or a deployment, you can instruct Kubernetes to have the pod scheduled by a specific scheduler.
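Conceptually, each scheduler only handles the pods that name it in spec.schedulerName, and pods that leave the field unset go to the scheduler named default-scheduler. The sketch below models that routing on plain dictionaries. The `group_by_scheduler` helper and the sample pods are hypothetical and exist only for illustration.

```python
from collections import defaultdict
from typing import Dict, List

# Name Kubernetes assumes when a pod does not set spec.schedulerName.
DEFAULT_SCHEDULER = "default-scheduler"


def group_by_scheduler(pods: List[dict]) -> Dict[str, List[str]]:
    """Map each scheduler name to the names of the pods that ask for it."""
    queues: Dict[str, List[str]] = defaultdict(list)
    for pod in pods:
        scheduler = pod.get("spec", {}).get("schedulerName") or DEFAULT_SCHEDULER
        queues[scheduler].append(pod["metadata"]["name"])
    return dict(queues)


# Hand-written pod objects, trimmed to the fields this sketch reads.
pods = [
    {"metadata": {"name": "web"}, "spec": {}},
    {"metadata": {"name": "sample-pod"}, "spec": {"schedulerName": "custom-scheduler"}},
]
print(group_by_scheduler(pods))
# prints: {'default-scheduler': ['web'], 'custom-scheduler': ['sample-pod']}
```

If a pod names a scheduler that is not running, nothing picks it up and the pod stays in the Pending state.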
kube-scheduler.yaml (Default)
$ cat /etc/kubernetes/manifests/kube-scheduler.yaml
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    component: kube-scheduler
    tier: control-plane
  name: kube-scheduler
  namespace: kube-system
spec:
  containers:
  - command:
    - kube-scheduler
    - --authentication-kubeconfig=/etc/kubernetes/scheduler.conf
    - --authorization-kubeconfig=/etc/kubernetes/scheduler.conf
    - --bind-address=127.0.0.1
    - --kubeconfig=/etc/kubernetes/scheduler.conf
    - --leader-elect=true
    - --port=0
    image: k8s.gcr.io/kube-scheduler:v1.19.4
    imagePullPolicy: IfNotPresent
    livenessProbe:
      failureThreshold: 8
      httpGet:
        host: 127.0.0.1
        path: /healthz
        port: 10259
        scheme: HTTPS
      initialDelaySeconds: 10
      periodSeconds: 10
      timeoutSeconds: 15
    name: kube-scheduler
    resources:
      requests:
        cpu: 100m
    startupProbe:
      failureThreshold: 24
      httpGet:
        host: 127.0.0.1
        path: /healthz
        port: 10259
        scheme: HTTPS
      initialDelaySeconds: 10
      periodSeconds: 10
      timeoutSeconds: 15
    volumeMounts:
    - mountPath: /etc/kubernetes/scheduler.conf
      name: kubeconfig
      readOnly: true
  hostNetwork: true
  priorityClassName: system-node-critical
  volumes:
  - hostPath:
      path: /etc/kubernetes/scheduler.conf
      type: FileOrCreate
    name: kubeconfig
status: {}
Create the second scheduler in the cluster
To run your scheduler in a Kubernetes cluster, create a Pod from a manifest like the one below (saved here as custom-scheduler.yaml):
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    component: custom-scheduler
  name: custom-scheduler
  namespace: kube-system
spec:
  containers:
  - command:
    - kube-scheduler
    - --authentication-kubeconfig=/etc/kubernetes/scheduler.conf
    - --authorization-kubeconfig=/etc/kubernetes/scheduler.conf
    - --bind-address=127.0.0.1
    - --kubeconfig=/etc/kubernetes/scheduler.conf
    - --leader-elect=true
    - --port=0
    # Must match the schedulerName that pods request; without this flag the
    # instance identifies itself as default-scheduler.
    - --scheduler-name=custom-scheduler
    image: k8s.gcr.io/kube-scheduler:v1.19.4
    # A container name is required for the manifest to be valid.
    name: custom-scheduler
    volumeMounts:
    - mountPath: /etc/kubernetes/scheduler.conf
      name: kubeconfig
      readOnly: true
  hostNetwork: true
  priorityClassName: system-node-critical
  volumes:
  - hostPath:
      path: /etc/kubernetes/scheduler.conf
      type: FileOrCreate
    name: kubeconfig
status: {}
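The scheduler authenticates to the API server with the kubeconfig at /etc/kubernetes/scheduler.conf:
$ cat /etc/kubernetes/scheduler.conf
A new user for the custom scheduler should be added to /etc/kubernetes/scheduler.conf.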
$ kubectl create -f custom-scheduler.yaml
pod/custom-scheduler created
$ kubectl get pods -n kube-system
NAME                        READY   STATUS    RESTARTS   AGE
custom-scheduler            1/1     Running   1          10s
kube-scheduler-kubemaster   1/1     Running   17         21d
Enable leader election
To run multiple schedulers with leader election enabled, update the following flags in the scheduler's YAML file so that each scheduler uses its own lock object:
--leader-elect=true
--lock-object-namespace=<lock-object-namespace>
--lock-object-name=<lock-object-name>
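The reason for giving each scheduler its own lock object can be shown with a toy model. Replicas of the same scheduler compete for one lock, and only the holder is active. A different scheduler that shared that lock would be shut out, so it needs a lock of its own. This sketch deliberately ignores lease renewal and expiry, which real leader election relies on. The `LockTable` class, the lock names, and the replica identities are hypothetical.

```python
from typing import Dict, Tuple

LockKey = Tuple[str, str]  # (lock object namespace, lock object name)


class LockTable:
    """Toy stand-in for the lock objects stored in the cluster."""

    def __init__(self) -> None:
        self._holders: Dict[LockKey, str] = {}

    def try_acquire(self, key: LockKey, identity: str) -> bool:
        """Return True if `identity` holds the lock after this call."""
        holder = self._holders.setdefault(key, identity)
        return holder == identity


locks = LockTable()
default_lock = ("kube-system", "kube-scheduler")   # illustrative lock name
custom_lock = ("kube-system", "custom-scheduler")  # illustrative lock name

# Two replicas of the default scheduler: only the first becomes leader.
print(locks.try_acquire(default_lock, "kube-scheduler-master1"))    # prints: True
print(locks.try_acquire(default_lock, "kube-scheduler-master2"))    # prints: False
# The custom scheduler uses its own lock, so it is not blocked.
print(locks.try_acquire(custom_lock, "custom-scheduler-master1"))   # prints: True
```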
To configure a new pod or deployment to use the new scheduler, set schedulerName in the pod spec:
apiVersion: v1
kind: Pod
metadata:
  name: sample-pod
  labels:
    app: sample
    type: nginx-server
spec:
  containers:
  - name: nginx-container
    image: nginx
  schedulerName: custom-scheduler
$ kubectl create -f sample-pod.yaml
pod/sample-pod created
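To confirm that the pod asked for the custom scheduler and was actually placed, you can inspect spec.schedulerName, spec.nodeName, and the PodScheduled condition in the output of kubectl get pod sample-pod -o json. The helper below is a minimal sketch of that check. The function name `scheduling_summary` is hypothetical, and the sample JSON is hand-written and trimmed (including the node name node01) rather than captured from a real cluster.

```python
import json


def scheduling_summary(pod: dict) -> dict:
    """Report the requested scheduler, the assigned node, and scheduled state."""
    spec = pod.get("spec", {})
    conditions = pod.get("status", {}).get("conditions", [])
    scheduled = any(
        c.get("type") == "PodScheduled" and c.get("status") == "True"
        for c in conditions
    )
    return {
        "scheduler": spec.get("schedulerName", "default-scheduler"),
        "node": spec.get("nodeName"),  # None while the pod is still unscheduled
        "scheduled": scheduled,
    }


# Hand-written, trimmed stand-in for `kubectl get pod sample-pod -o json`.
sample = json.loads("""
{
  "metadata": {"name": "sample-pod"},
  "spec": {"schedulerName": "custom-scheduler", "nodeName": "node01"},
  "status": {"conditions": [{"type": "PodScheduled", "status": "True"}]}
}
""")
print(scheduling_summary(sample))
# prints: {'scheduler': 'custom-scheduler', 'node': 'node01', 'scheduled': True}
```

If the custom scheduler is not running or is misconfigured, the node field stays empty and the pod remains Pending.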