Automation Suite
2022.10
Automation Suite Installation Guide
Last updated Apr 24, 2024

Backup failed due to TooManySnapshots error

Description

During a backup, each Longhorn volume is backed up by taking a snapshot of the volume and pushing it to a remote location. If snapshot creation fails for a Longhorn volume because the volume already has more than 248 snapshots, the backup does not succeed.

You can check whether the backup failed due to the TooManySnapshots error in the following ways:
  • By checking the Velero logs:

    kubectl logs -n velero -l app.kubernetes.io/name=velero -c velero | grep "Waiting for volumesnapshotcontents"

    Example output:

    time="2023-12-15T08:15:59Z" level=info msg="Waiting for volumesnapshotcontents snapcontent-f073b88e-bd01-40a8-a645-ec929e276cef to have snapshot handle. Retrying in 5s" backup=velero/daily-2 cmd=/plugins/velero-plugin-for-csi logSource="/go/src/velero-plugin-for-csi/internal/util/util.go:182" pluginName=velero-plugin-for-csi
  • By checking the Longhorn pod logs:

    kubectl logs -n longhorn-system -l app=csi-snapshotter --tail=-1 | grep "too many snapshots created"

    Example output:

    I1215 08:39:41.707351       1 snapshot_controller.go:291] createSnapshotWrapper: CreateSnapshot for content snapcontent-f073b88e-bd01-40a8-a645-ec929e276cef returned error: rpc error: code = Internal desc = Bad response statusCode [500]. Status [500 Internal Server Error]. Body: [code=Server Error, detail=, message=failed to create snapshot: proxyServer=10.42.7.56:8501 destination=10.42.7.56:10004: failed to snapshot volume: rpc error: code = Unknown desc = failed to create snapshot snapshot-f073b88e-bd01-40a8-a645-ec929e276cef for volume 10.42.7.56:10004: rpc error: code = Unknown desc = too many snapshots created] from [http://longhorn-backend:9500/v1/volumes/pvc-7d89efa4-3d60-4837-a632-f190cd3cd9ed?action=snapshotCreate]
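
The two checks above can be combined into one helper that reads log text on standard input and reports which of the two messages it found. This is a minimal sketch: the function name classify_backup_logs and the labels it prints are hypothetical, and only the two grep patterns come from this page.

```shell
#!/bin/bash
# Sketch only: classify_backup_logs is a hypothetical helper, not part of
# Velero or Longhorn. It reads log text on stdin and looks for the two
# messages quoted on this page.
classify_backup_logs() {
	local logs
	logs=$(cat)
	if printf '%s\n' "$logs" | grep -q "too many snapshots created"; then
		# Message from the Longhorn csi-snapshotter logs
		echo "longhorn: too many snapshots created"
	elif printf '%s\n' "$logs" | grep -q "Waiting for volumesnapshotcontents"; then
		# Message from the Velero logs
		echo "velero: waiting for snapshot handle"
	else
		echo "no match"
	fi
}

# Example (requires cluster access):
#   kubectl logs -n longhorn-system -l app=csi-snapshotter --tail=-1 | classify_backup_logs
```

The Longhorn message is tested first because it names the cause directly, whereas the Velero message only shows that the backup is waiting on a snapshot.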

Solution

If you encounter the TooManySnapshots error, clean up the snapshots for all volumes by running the following script. For details on automating this operation, see How to automatically clean up Longhorn snapshots.
#!/bin/bash
set -e

# longhorn backend URL
url=
# By default, snapshot older than 10 days will be deleted
days=10

function display_usage() {
	echo "usage: $(basename "$0") [-h] -u longhorn-url -d days"
	echo "  -u	Longhorn URL"
	echo "  -d 	Number of days(should be >0). By default, script will delete snapshot older than 10 days."
	echo "  -h	Print help"
}

# The leading ":" enables silent error reporting, so that a missing
# argument is reported through the ":" case below
while getopts ':hd:u:' flag "$@"; do
	case "${flag}" in
		u)
			url=${OPTARG}
			;;
		d)
			days=${OPTARG}
			[ "$days" ] && [ -z "${days//[0-9]}" ] || { echo "Invalid number of days=$days"; exit 1; }
			;;
		h)
			display_usage
			exit 0
			;;
		:)
			echo "Invalid option: -${OPTARG} requires an argument."
			exit 1
			;;
		*)
			echo "Unexpected option -${OPTARG}"
			exit 1
			;;
	esac
done

[[ -z "$url" ]] && echo "Missing longhorn URL" && exit 1

# check if URL is valid
curl -s --connect-timeout 30 "${url}/v1" >> /dev/null || { echo "Unable to connect to longhorn backend"; exit 1; }

echo "Deleting snapshots older than $days days"

# Fetch list of longhorn volumes
vols=$(curl -s -X GET "${url}/v1/volumes" | jq -r '.data[].name')

# delete given snapshot for given volume
function delete_snapshot() {
	local vol=$1
	local snap=$2

	[[ -z "$vol" || -z "$snap" ]] && echo "Error: delete_snapshot: Empty argument" && return 1
	# the URL is quoted so that "?" is never treated as a glob character
	curl -s -X POST "${url}/v1/volumes/${vol}?action=snapshotDelete" -d '{"name": "'"$snap"'"}'
	echo "Snapshot=$snap deleted for volume=$vol"
}

#perform cleanup for given volume
function cleanup_volume() {
	local vol=$1
	local deleted_snap=0

	[[ -z "$vol" ]] && echo "Error: cleanup_volume: Empty argument" && return 1

	# fetch the list of user-created snapshots
	snaps=$(curl -s -X POST "${url}/v1/volumes/${vol}?action=snapshotList" | jq -r '.data[] | select(.usercreated==true) | .name')
	for i in $snaps; do
		if [[ $i == "volume-head" ]]; then
			continue
		fi

		# calculate the snapshot age in whole days:
		# subtract the timestamps first, then divide by the seconds in a day
		snapTime=$(curl -s -X POST "${url}/v1/volumes/${vol}?action=snapshotGet" -d '{"name":"'"$i"'"}' | jq -r '.created')
		currentTime=$(date "+%s")
		timeDiff=$(( (currentTime - $(date -d "$snapTime" "+%s")) / 86400 ))
		if [[ $timeDiff -lt $days ]]; then
			echo "Ignoring snapshot $i, since it is only $timeDiff days old"
			continue
		fi

		# trigger deletion for snapshot
		delete_snapshot "$vol" "$i"
		deleted_snap=$((deleted_snap+1))
	done

	if [[ "$deleted_snap" -gt 0 ]]; then
		# trigger purge for volume
		curl -s -X POST "${url}/v1/volumes/${vol}?action=snapshotPurge" >> /dev/null
	fi

}

for i in ${vols[@]}; do
	cleanup_volume $i
done
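
The age check in the script comes down to converting the snapshot's creation timestamp to epoch seconds, subtracting it from the current time, and dividing the difference by 86400. The sketch below isolates that calculation so it can be tried without a cluster. The helper names snapshot_age_days and is_older_than are hypothetical, and the sketch assumes GNU date (for the -d option) and an ISO 8601 timestamp such as the ones in the log samples on this page.

```shell
#!/bin/bash
# Sketch only: helper names are hypothetical. Requires GNU date (-d).

# snapshot_age_days CREATED [NOW_EPOCH]
# Prints the age of a snapshot in whole days.
snapshot_age_days() {
	local created_epoch now_epoch
	created_epoch=$(date -d "$1" "+%s") || return 1
	now_epoch=${2:-$(date "+%s")}
	# Subtract first, then divide by the number of seconds in a day
	echo $(( (now_epoch - created_epoch) / 86400 ))
}

# is_older_than CREATED DAYS [NOW_EPOCH]
# Succeeds when the snapshot is at least DAYS days old.
is_older_than() {
	local age
	age=$(snapshot_age_days "$1" "$3") || return 2
	[ "$age" -ge "$2" ]
}

# Fixed reference time so the example is reproducible
now=$(date -d "2023-12-15T08:15:59Z" "+%s")
snapshot_age_days "2023-12-05T08:15:59Z" "$now"   # prints 10
```

The parentheses around the subtraction matter: without them, only the creation timestamp is divided by 86400 and the result is no longer a number of days.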

When running the script, you must pass the following arguments:

  • -u: the Longhorn backend URL. To get the Longhorn backend URL, run the following command:
    kubectl get svc -n longhorn-system longhorn-backend -o json | jq -r '.spec | (.clusterIP|tostring) + ":" + (.ports[0].port|tostring)'
  • -d: the age threshold in days. Snapshots at least this many days old are deleted. The default is 10.
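
Putting the two arguments together, a run of the script could look like the sketch below. The file name longhorn-snapshot-cleanup.sh and the wrapper functions are assumptions made for illustration. The page does not name the script file, so adjust the path to wherever you saved it.

```shell
#!/bin/bash
# Sketch only. CLEANUP_SCRIPT is a hypothetical path to the cleanup script
# shown on this page; the function names are also hypothetical.
CLEANUP_SCRIPT=${CLEANUP_SCRIPT:-./longhorn-snapshot-cleanup.sh}

# Looks up the Longhorn backend address (clusterIP:port) with the
# kubectl command documented for the -u argument.
longhorn_backend_url() {
	kubectl get svc -n longhorn-system longhorn-backend -o json |
		jq -r '.spec | (.clusterIP|tostring) + ":" + (.ports[0].port|tostring)'
}

# run_cleanup URL [DAYS]
# Runs the cleanup script; DAYS defaults to 10, like the script itself.
run_cleanup() {
	bash "$CLEANUP_SCRIPT" -u "$1" -d "${2:-10}"
}

# Example (requires cluster access):
#   run_cleanup "$(longhorn_backend_url)" 10
```

The address returned by the lookup has no scheme. The script passes it to curl, which treats a URL without a scheme as HTTP.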
Note:
To list the volumes affected by the TooManySnapshots error, run the following command:
kubectl get volumes -n longhorn-system -o json | jq -r '.items[] | select(([ .status.conditions[] | select(.type == "toomanysnapshots" and .status == "True") ] | length ) == 1 ) | .metadata.name'