If you want to gracefully shut down a node in the cluster, you must use the drain-node.sh
script.
The drain-node.sh script is as follows:
#!/bin/bash
# =================
#
#
#
#
# Copyright UiPath 2021
#
# =================
# LICENSE AGREEMENT
# -----------------
# Use of paid UiPath products and services is subject to the licensing agreement
# executed between you and UiPath. Unless otherwise indicated by UiPath, use of free
# UiPath products is subject to the associated licensing agreement available here:
# https://www.uipath.com/legal/trust-and-security/legal-terms (or successor website).
# You must not use this file separately from the product it is a part of or is associated with.
#
#
#
# =================
# Resolve the Kubernetes node name for this machine: match by hostname first,
# then fall back to matching one of the machine's IP addresses against the nodes' InternalIP.
fetch_hostname(){
  HOST_NAME_NODE=$(kubectl get nodes -o name | cut -d'/' -f2 | grep "$(hostname)")
  if ! [[ -n ${HOST_NAME_NODE} && "$(hostname)" == "$HOST_NAME_NODE" ]]; then
    for private_ip in $(hostname --all-ip-addresses); do
      output=$(kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.addresses[?(@.type=="InternalIP")].address}{"\n"}{end}' | grep "$private_ip")
      ip_address=$(echo "$output" | cut -f2 -d$'\t')
      if [[ -n ${ip_address} && "$private_ip" == "$ip_address" ]]; then
        HOST_NAME_NODE=$(echo "$output" | cut -f1 -d$'\t')
        break
      fi
    done
  fi
}
# Make kubectl available and point KUBECONFIG at the RKE2 kubeconfig,
# preferring the server config (rke2.yaml) over the agent kubeconfig when both exist.
set_kubeconfig(){
  export PATH=$PATH:/var/lib/rancher/rke2/bin:/usr/local/bin
  [[ -f "/var/lib/rancher/rke2/agent/kubelet.kubeconfig" ]] && export KUBECONFIG="/var/lib/rancher/rke2/agent/kubelet.kubeconfig"
  [[ -f "/etc/rancher/rke2/rke2.yaml" ]] && export KUBECONFIG="/etc/rancher/rke2/rke2.yaml"
}
# Wait (up to 5 minutes) until the API server responds to kubectl commands.
is_kubectl_enabled(){
  local try=0
  local maxtry=60
  local status="notready"
  echo "Checking if node $HOST_NAME_NODE is ready to run kubectl commands."
  while [[ ${status} == "notready" ]] && (( try != maxtry )) ; do
    try=$((try+1))
    kubectl cluster-info >/dev/null 2>&1 && status="ready"
    sleep 5;
  done
  if [[ ${status} == "notready" ]]; then
    echo "Node is not ready to accept kubectl commands"
  else
    echo "Node is ready to accept kubectl commands"
  fi
}
# Ensure IP forwarding is enabled, either persistently via /etc/sysctl.conf or directly through /proc.
enable_ipforwarding() {
  local file_name="/etc/sysctl.conf"
  echo "Enable IP Forwarding..."
  if [[ ! -f "${file_name}" || ! -w "${file_name}" ]]; then
    # either the file is not available or the user doesn't have edit permission
    echo "Either file ${file_name} not present or file is not writable. Enabling ip forward using /proc/sys/net/ipv4/ip_forward..."
    echo 1 > /proc/sys/net/ipv4/ip_forward
  else
    echo "File ${file_name} is available and is writable. Checking and enabling ip forward..."
    is_ipforwarding_available=$(grep "net.ipv4.ip_forward" "${file_name}") || true
    if [[ -z ${is_ipforwarding_available} ]]; then
      echo "Adding net.ipv4.ip_forward = 1 in ${file_name}..."
      echo "net.ipv4.ip_forward = 1" >> "${file_name}"
    else
      echo "Updating net.ipv4.ip_forward value with 1 in ${file_name}..."
      # shellcheck disable=SC2016
      sed -i -n -e '/^net.ipv4.ip_forward/!p' -e '$anet.ipv4.ip_forward = 1' "${file_name}"
    fi
    sysctl -p
  fi
}
set_kubeconfig
is_kubectl_enabled
fetch_hostname
if [[ -n "$HOST_NAME_NODE" ]]; then
  # Pass an argument to uncordon the node. This is to cover reboot scenarios.
  if [ "$1" ]; then
    # enable ip forwarding
    enable_ipforwarding
    # uncordon the node
    echo "Uncordon $HOST_NAME_NODE ..."
    kubectl uncordon "$HOST_NAME_NODE"
  else
    # If PDB is enabled and there are zero available replicas on other nodes, the drain would fail for those pods, but that is not the behaviour we want.
    # That is when the second command comes to the rescue: it ignores the PDB and continues with the eviction of the pods whose eviction failed earlier. See https://github.com/kubernetes/kubernetes/issues/83307
    kubectl drain "$HOST_NAME_NODE" --delete-emptydir-data --ignore-daemonsets --timeout=90s --skip-wait-for-delete-timeout=10 --force --ignore-errors || kubectl drain "$HOST_NAME_NODE" --delete-emptydir-data --ignore-daemonsets --force --disable-eviction=true --timeout=30s --ignore-errors --skip-wait-for-delete-timeout=10 --pod-selector 'app!=csi-attacher,longhorn.io/component!=instance-manager,k8s-app!=kube-dns'
    node_mounted_pv=$(kubectl get volumeattachment -o json | jq --arg node "${HOST_NAME_NODE}" -r '.items[] | select(.spec.nodeName==$node) | .metadata.name + ":" + .spec.source.persistentVolumeName')
    if [[ -n "${node_mounted_pv}" ]] ; then
      while IFS=$'\n' read -r VOL_ATTACHMENT_PV_ID
      do
        PV_ID=$(echo "${VOL_ATTACHMENT_PV_ID}" | cut -d':' -f2)
        VOL_ATTACHMENT_ID=$(echo "${VOL_ATTACHMENT_PV_ID}" | cut -d':' -f1)
        if [[ -n "${PV_ID}" ]] ; then
          mounts=$(grep "${PV_ID}" /proc/mounts | awk '{print $2}')
          if [[ -n $mounts ]] ; then
            echo "Removing dangling mounts for pvc: ${PV_ID}"
            {
              timeout 20s xargs umount -l <<< "${mounts}"
              exitCode="$?"
              if [[ $exitCode -eq 0 ]] ; then
                echo "Command to remove dangling mounts for pvc ${PV_ID} executed successfully"
                echo "Waiting to remove dangling mounts for pvc ${PV_ID}"
                if timeout 1m bash -c "while grep -q '${PV_ID}' /proc/mounts ; do sleep 1 ; done" ; then
                  kubectl delete volumeattachment "${VOL_ATTACHMENT_ID}"
                  if timeout 2m bash -c "while kubectl get node '${HOST_NAME_NODE}' -o yaml | grep -q '${PV_ID}' ; do sleep 1 ; done" ; then
                    #shellcheck disable=SC1012
                    find /var/lib/kubelet -name "${PV_ID}" -print0 | xargs -0 \rm -rf
                    echo "Removed dangling mounts for pvc: ${PV_ID} successfully"
                  else
                    echo "Timeout while waiting to remove node dangling mounts for pvc: ${PV_ID}"
                  fi
                else
                  echo "Timeout while waiting to remove dangling mounts for pvc: ${PV_ID}"
                fi
              elif [[ $exitCode -eq 124 ]] ; then
                echo "Timeout while executing remove dangling mounts for pvc: ${PV_ID}"
              else
                echo "Error while executing remove dangling mounts for pvc: ${PV_ID}"
              fi
            } &
          fi
        fi
      done <<< "${node_mounted_pv}"
      wait
    fi
  fi
else
  echo "Not able to fetch hostname"
fi
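The script behaves differently depending on whether an argument is passed: with no argument it drains and cordons the node, and with any argument it re-enables IP forwarding and uncordons the node. The value nodestart used later in this page is only a conventional label; the script checks only that an argument is present. A minimal usage sketch, based on reading the script above:
# Before maintenance: cordon and drain this node
sudo bash drain-node.sh
# After the node is back up: uncordon it so workloads can be scheduled on it again
sudo bash drain-node.sh nodestart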
To gracefully shut down a node, take the following steps:
1. Save the drain-node.sh script on the node you want to shut down and run it using the following command:
sudo bash drain-node.sh
Note:
Every physical or virtual machine represents a node in Kubernetes. For more details, see the Kubernetes documentation.
The drain-node.sh script drains all the resources scheduled on the node and moves them to other nodes wherever possible. This operation is necessary to ensure the graceful termination of those workloads before any kind of maintenance activity on the node. The node is marked as unschedulable so that no new workloads are scheduled on it from that moment on; you can confirm this state using the checks shown after this procedure. For more information about draining a node, see the Kubernetes documentation.
2. Proceed with the VM shutdown. Make sure to follow the shutdown procedure according to your company policies.
For instance, to shut down a machine in Azure, you can use the STOP button on the VM overview page or run the sudo shutdown now command.
3. Restart the node, then run the following command so that the node can start accepting workloads again:
sudo bash drain-node.sh nodestart
Note:
Before the node shutdown, the node was marked as unschedulable. The command in Step 3 marks it as schedulable again, so that Kubernetes can start scheduling workloads. For more details on the commands required for node shutdown, see the Kubernetes documentation.
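If you want to confirm the node state at each step, the following checks can help. This is a minimal sketch that assumes the Kubernetes node name matches the machine hostname (the script itself falls back to matching by internal IP when it does not):
# Step 1: after draining, the node should report a SchedulingDisabled status
kubectl get node "$(hostname)"
# Only DaemonSet-managed pods should still be scheduled on the node
kubectl get pods --all-namespaces --field-selector spec.nodeName="$(hostname)" -o wide
# Step 3: after running drain-node.sh nodestart, the node should report Ready with no SchedulingDisabled marker
kubectl get node "$(hostname)"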