automation-suite

2023.10

Automation Suite on Linux installation guide

Last updated Nov 13, 2025

Troubleshooting storage

If you run into storage-related issues, see the following:

Failure to compact metrics due to corrupted blocks in Thanos

Description

The Thanos compactor can fail to compact metrics when it detects corrupted blocks in the object store. While in this state, the compactor stops processing metrics, which leads to increased storage usage in the Ceph bucket.
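
One way to confirm this condition is to read the `thanos_compact_halted` metric that the compactor exposes, which is the same metric the cleanup script later in this section checks on port 10902. The following is a minimal sketch; the function name `compact_halted_value` and the `POD_IP` variable are illustrative assumptions, not part of Automation Suite.

```shell
#!/bin/bash
# Sketch: read Prometheus-format metrics on stdin and print the value of
# thanos_compact_halted (1 means compaction is halted, 0 means it is running).
compact_halted_value() {
  # Lines starting with "#" (HELP/TYPE) do not match the pattern and are skipped.
  awk '/^thanos_compact_halted/ {print $2}'
}

# Hypothetical usage against a compactor pod IP:
#   curl -s "http://${POD_IP}:10902/metrics" | compact_halted_value
```

A value of 1 suggests that the compactor has halted and the steps below apply.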

Solution

To fix the issue, take the following steps:

On any server node, run the following script:

thanosns=monitoring && if kubectl get application -n argocd rancher-monitoring; then thanosns=cattle-monitoring-system; fi && cat <<EOF | kubectl apply -f -
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  annotations:
  labels:
    app.kubernetes.io/component: thanos-cleaner
    app.kubernetes.io/instance: thanos-block-cleaner
    app.kubernetes.io/name: thanos-block-cleaner
  name: thanos-cleaner-role
  namespace: ${thanosns}
rules:
- apiGroups:
  - apps
  resources:
  - statefulsets
  - statefulsets/scale
  verbs:
  - list
  - get
  - update
  - patch
- apiGroups:
  - batch
  resources:
  - jobs
  - cronjobs
  verbs:
  - delete
  - list
  - get
  - update
  - create
  - watch
- apiGroups:
  - ""
  resources:
  - pods
  verbs:
  - delete
  - list
  - get
  - update
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  labels:
    app.kubernetes.io/component: thanos-cleaner
    app.kubernetes.io/instance: thanos-block-cleaner
    app.kubernetes.io/name: thanos-block-cleaner
  name: thanos-cleaner-role-binding
  namespace: ${thanosns}
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: thanos-cleaner-role
subjects:
- kind: ServiceAccount
  name: thanos-cleaner
  namespace: ${thanosns}
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: thanos-cleaner
  namespace: ${thanosns}
---
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: thanos-cleaner
  namespace: uipath
spec:
  groups:
  - name: thanos
    rules:
    - alert: ThanosCompactorNotWorking
      annotations:
        description: Thanos compactor is not working. This will disable metrics compaction
          in objectstore bucket. Please check thanos compact pod in ${thanosns} namespace
          for any error. Compactor in faulty state will exhaust object store space
        message: Thanos compactor is not working. Please check if thanos cleaner job
          is functional and able to fix corruption
        runbook_url: https://docs.uipath.com/automation-suite/docs/alert-runbooks
        summary: Thanos compactor is not working
      expr: thanos_compactor_issue{job="thanos-cleaner"} >= 1
      for: 1d
      labels:
        app: thanos
        severity: critical
---
EOF
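
To check that the objects from the script above exist, something like the following sketch could be used. The function names `thanos_namespace` and `verify_cleaner_rbac` are illustrative assumptions; the resource names and the namespace detection are taken from the script above.

```shell
#!/bin/bash
# Sketch: resolve the Thanos namespace the same way the script above does.
thanos_namespace() {
  # If the rancher-monitoring application exists in argocd, Thanos lives in
  # cattle-monitoring-system; otherwise it lives in monitoring.
  if kubectl get application -n argocd rancher-monitoring >/dev/null 2>&1; then
    echo "cattle-monitoring-system"
  else
    echo "monitoring"
  fi
}

# Sketch: list each object the script is expected to have created.
verify_cleaner_rbac() {
  ns=$(thanos_namespace)
  kubectl get role thanos-cleaner-role -n "$ns"
  kubectl get rolebinding thanos-cleaner-role-binding -n "$ns"
  kubectl get serviceaccount thanos-cleaner -n "$ns"
  kubectl get prometheusrule thanos-cleaner -n uipath
}
```

Running `verify_cleaner_rbac` on a server node should list all four objects; a "NotFound" error points at the object that was not applied.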

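The ConfigMap in the next script is created in the cattle-monitoring-system namespace. If the previous step resolved thanosns to monitoring, you may need to change metadata.namespace to match, because a pod can only mount a ConfigMap from its own namespace. On any server node, run the following script: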

cat <<'EOF' | kubectl apply -f -
---
apiVersion: v1
data:
  thanos-cleanup.sh: |
    #!/bin/bash

    # Copyright UiPath 2021
    #
    # =================
    # LICENSE AGREEMENT
    # -----------------
    #   Use of paid UiPath products and services is subject to the licensing agreement
    #   executed between you and UiPath. Unless otherwise indicated by UiPath, use of free
    #   UiPath products is subject to the associated licensing agreement available here:
    #   https://www.uipath.com/legal/trust-and-security/legal-terms (or successor website).
    #   You must not use this file separately from the product it is a part of or is associated with.

    set -eu -o pipefail

    export PATH=$PATH:/thanos-bin/
    # Below script removes the blocks which are overlapping or having index issue or having duplicated compaction
    #
    # In few cases with above mentioned scenarios, thanos may skip the compaction and halt the compaction module.
    # Compaction halt requires manual deletion of corrupted blocks and restart of compact pod.

    config_file=/etc/thanos/${THANOS_CONFIG_KEY}

    function info() {
      echo "[INFO] [$(date +'%Y-%m-%dT%H:%M:%S%z')]: $*"
    }

    function warn() {
      echo -e "\e[0;33m[WARN] [$(date +'%Y-%m-%dT%H:%M:%S%z')]:\e[0m $*" >&2
    }

    function error_without_exit() {
      echo -e "\e[0;31m[ERROR][$(date +'%Y-%m-%dT%H:%M:%S%z')]:\e[0m $*" >&2
    }

    function error() {
      echo -e "\e[0;31m[ERROR][$(date +'%Y-%m-%dT%H:%M:%S%z')]:\e[0m $*" >&2
      exit 1
    }

    function is_compaction_halted() {
      info "Checking if thanos compactor running"

      IFS=" " read -r -a compactor_addresses <<<"$(kubectl get pods -n "$NAMESPACE" -l app.kubernetes.io/instance=thanos-compact -o jsonpath="{.items[*].status.podIP}")"

      is_compactor_halted=0

      if [[ "${#compactor_addresses[@]}" -eq 0 ]]; then
        info "Thanos compactor pod is not running"
        is_compactor_halted=1
      fi

      for ip in "${compactor_addresses[@]}"; do
        #shellcheck disable=SC2086
        halted=$(curl -s http://${ip}:10902/metrics | grep thanos_compact_halted | grep -v '#' | awk -F ' ' '{print $2}')
        if [[ "$halted" -eq "1" ]]; then
          warn "Compaction is halted"
          is_compactor_halted=1
          break
        fi
      done

      return $is_compactor_halted
    }

    function execute_thanos_issue_command() {
      if [[ $# -ne 1 ]]; then
        error "missing issue name for execute_thanos_issue_command function"
      fi

      issue=$1

      info "Checking for issue $issue"
      cmd_ret=0
      #shellcheck disable=SC2086
      verify_output=$(thanos tools bucket --objstore.config-file=${config_file} verify --log.format=json -i $issue 2>&1) && true || cmd_ret=1
      if [[ $cmd_ret -eq 1 ]]; then
        error_without_exit "Output of $issue command: -> $verify_output"
        error "Failed to verify bucket for $issue"
      fi

      #shellcheck disable=SC2086
      echo $verify_output
    }

    function fix_index_issue() {
      info "Fixing index_known_issue issue"

      verify_output=$(execute_thanos_issue_command "index_known_issues")
      #shellcheck disable=SC2086
      for b in $(echo $verify_output | sed 's/} {/\r\n/g' | grep err | grep "detected issue" | awk -F '"id":' '{print $2}' | awk -F ',' '{print $1}' | tr -d '"'); do
        info "Block=$b is having the issue, removing it.."

        thanos tools bucket mark --id="$b" \
          --marker=deletion-mark.json \
          --details="deleted by job" \
          --objstore.config-file="${config_file}"

        info "Block=$b is marked for deletion"
      done

      info "Fixing index_known_issue issue done"
    }

    function fix_overlapping_issue() {
      info "Fixing overlapped_blocks issue"

      overlap_output=$(execute_thanos_issue_command "overlapped_blocks")

      while IFS= read -r line; do
        #shellcheck disable=SC2086
        for b in $(echo $line | awk -F '"overlap":' '{print $2}' | awk -v search="ulid" 'match($0, search) {print substr($0, RSTART)}' | sed 's/ulid/\r\nulid/g' | awk -F ',' '{print $1}' | grep '^ulid' | awk -F ': ' '{print $2}'); do
          info "Block=$b is having the issue, removing it.."
          thanos tools bucket mark --id="$b" \
            --marker=deletion-mark.json \
            --details="deleted by job" \
            --objstore.config-file="${config_file}"

          info "Block=$b is marked for deletion"
        done
      done < <(echo "$overlap_output" | sed 's/} {/\r\n/g' | grep "found overlapped blocks")

      info "Fixing overlapped_blocks issue done"
    }

    function fix_duplicate_issue() {
      info "Fixing duplicated_compaction issue"
      duplicate_output=$(execute_thanos_issue_command "duplicated_compaction")
      #shellcheck disable=SC2086,SC2006
      for b in $(echo $duplicate_output | sed 's/ts=2/\r\n2/g' | grep "Found duplicated blocks that are ok to be removed" | awk -F 'ULIDs="' '{print $2}' | tr -d '[]' | awk -F '"' '{print $1}'); do
        info "Block=$b is having the issue, removing it.."
        thanos tools bucket mark --id="$b" \
          --marker=deletion-mark.json \
          --details="deleted by job" \
          --objstore.config-file="${config_file}"

        info "Block=$b is marked for deletion"
      done

      info "Fixing duplicated_compaction issue done"
    }

    if [[ -z "$NAMESPACE" ]]; then
      error "NAMESPACE is not set"
    fi

    # We will check if compaction is halted or not before checking for issues
    if is_compaction_halted; then
      info "Thanos compaction is working"
      echo "thanos_compactor_issue 0" | curl --data-binary @- "http://pushgateway-prometheus-pushgateway.uipath.svc.cluster.local:9091/metrics/job/thanos-cleaner"
      exit 0
    fi

    warn "Thanos compactor is not working. Checking for corrupted blocks..."
    echo "thanos_compactor_issue 1" | curl --data-binary @- "http://pushgateway-prometheus-pushgateway.uipath.svc.cluster.local:9091/metrics/job/thanos-cleaner"

    if [[ "$DISABLE_BLOCK_CLEANER" == true ]]; then
      info "DISABLE_BLOCK_CLEANER is set to $DISABLE_BLOCK_CLEANER, skipping block clean"
      exit 0
    fi

    info "DISABLE_BLOCK_CLEANER is set to $DISABLE_BLOCK_CLEANER, removing corrupted blocks"

    replica=$(kubectl get sts -n "$NAMESPACE" thanos-compact -o jsonpath='{.spec.replicas}')

    # compactor must not be running while deleting blocks

    info "Stopping compactor"
    kubectl scale sts -n "$NAMESPACE" thanos-compact --replicas=0
    kubectl delete pods -n "$NAMESPACE" -l app.kubernetes.io/instance=thanos-compact --force

    # fixing index_known_issues
    info "Checking blocks having issue"

    fix_index_issue
    fix_overlapping_issue
    fix_duplicate_issue

    info "Triggering deletion of all marked blocks"

    #shellcheck disable=SC2086
    thanos tools bucket cleanup --delete-delay=0 --objstore.config-file=${config_file}

    info "Corrupted blocks are deleted"

    info "Scaling thanos compactor's replica to $replica"
    #shellcheck disable=SC2086
    kubectl scale sts -n "$NAMESPACE" thanos-compact --replicas=$replica
    info "Thanos compactor started"
  validate-cronjob.sh: |
    #!/bin/bash

    # Copyright UiPath 2021
    #
    # =================
    # LICENSE AGREEMENT
    # -----------------
    #   Use of paid UiPath products and services is subject to the licensing agreement
    #   executed between you and UiPath. Unless otherwise indicated by UiPath, use of free
    #   UiPath products is subject to the associated licensing agreement available here:
    #   https://www.uipath.com/legal/trust-and-security/legal-terms (or successor website).
    #   You must not use this file separately from the product it is a part of or is associated with.

    set -eu -o pipefail

    function info() {
      echo "[INFO] [$(date +'%Y-%m-%dT%H:%M:%S%z')]: $*"
    }

    function warn() {
      echo -e "\e[0;33m[WARN] [$(date +'%Y-%m-%dT%H:%M:%S%z')]:\e[0m $*" >&2
    }

    function error_without_exit() {
      echo -e "\e[0;31m[ERROR][$(date +'%Y-%m-%dT%H:%M:%S%z')]:\e[0m $*" >&2
    }

    function error() {
      echo -e "\e[0;31m[ERROR][$(date +'%Y-%m-%dT%H:%M:%S%z')]:\e[0m $*" >&2
      exit 1
    }

    alias kubectl='kubectl --cache-dir=/tmp/'
    IFS="," read -ra cronjobs <<<"$CRONJOB_LIST"

    for cr in "${cronjobs[@]}"; do
      #shellcheck disable=SC2206
      name=(${cr//// })
      cronNs=default
      cronName=""

      if [[ ${#name[@]} -gt 2 || ${#name[@]} -lt 1 ]]; then
        error "Invalid cronjob name=$cr"
      fi

      if [[ ${#name[@]} -eq 2 ]]; then
        cronNs=${name[0]}
        cronName=${name[1]}
      else
        cronName=${name[0]}
      fi

      info "Validating cronjob=$cr"

      jobName="${cronName}-sf-job-validation"

      created=1
      info "Creating validation job for $cr"
      kubectl delete job -n "${cronNs}" "${jobName}" --ignore-not-found --timeout=3m

      #shellcheck disable=SC2086
      kubectl create job -n "${cronNs}" --from=cronjob/${cronName} "$jobName" || created=0

      if [[ $created == 0 ]]; then
        error "Failed to create job for $cr"
      fi

      #shellcheck disable=SC2086
      kubectl wait --timeout=20m --for=condition=complete -n "${cronNs}" job/$jobName &
      cpid=$!

      #shellcheck disable=SC2086
      kubectl wait --timeout=20m --for=condition=failed -n "${cronNs}" job/${jobName} && exit 1 &
      fpid=$!

      ret=0
      wait -n $cpid $fpid || ret=1

      kill -9 $cpid || true
      kill -9 $fpid || true

      if [[ $ret -eq 0 ]]; then
        info "Job for $cr is validated/completed"
        #ignore deletion error. if deletion fail then will get caught in next sync. This is to reduce failure during installation
        kubectl delete job -n "${cronNs}" "${jobName}" --timeout=3m || true
      else
        error "Job for $cr failed"
      fi
    done
kind: ConfigMap
metadata:
  name: thanos-cleaner-script
  namespace: cattle-monitoring-system
---
EOF
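
The top-level control flow of `thanos-cleanup.sh` can be summarized with the sketch below. It is a simplified illustration rather than the script itself: the function name `cleaner_action` and the three result labels are assumptions introduced here for readability.

```shell
#!/bin/bash
# Sketch of the decision made by thanos-cleanup.sh.
# $1: 1 if compaction is halted or no compactor pod is found, otherwise 0
# $2: value of the DISABLE_BLOCK_CLEANER environment variable
cleaner_action() {
  if [ "$1" -eq 0 ]; then
    # Script pushes "thanos_compactor_issue 0" to the Pushgateway and exits.
    echo "healthy"
  elif [ "$2" = "true" ]; then
    # Script pushes "thanos_compactor_issue 1" but leaves the blocks untouched.
    echo "alert-only"
  else
    # Script pushes "thanos_compactor_issue 1", scales the compactor to 0,
    # marks and deletes the bad blocks, then restores the replica count.
    echo "clean"
  fi
}
```

With `DISABLE_BLOCK_CLEANER` set to `"false"`, as in the CronJob later in this section, a halted compactor leads to the `clean` path.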

Replace SF_K8S_TAG with the correct image tag, and then apply the CronJob.

From the installer directory on any server node, get the latest tag:

cat versions/docker-images.json | grep uipath/sf-k8-utils-rhel | tr -d ',"' | awk -F ':' '{print $2}' | sort | uniq | tail -1
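
To see what the pipeline above does, the sketch below wraps it in a function that reads from stdin and runs it on sample input. The function name `latest_sf_k8_utils_tag`, the registry name, and the sample tags are hypothetical; the real layout of `versions/docker-images.json` is assumed to be one quoted image reference per line.

```shell
#!/bin/bash
# Sketch: same pipeline as above, reading the file content from stdin.
latest_sf_k8_utils_tag() {
  grep uipath/sf-k8-utils-rhel |   # keep only the sf-k8-utils-rhel image lines
    tr -d ',"' |                   # strip JSON quotes and commas
    awk -F ':' '{print $2}' |      # keep the part after the first colon (the tag)
    sort | uniq | tail -1          # last tag in lexical order
}

# Hypothetical sample input:
printf '%s\n' \
  '  "registry.example/uipath/sf-k8-utils-rhel:0.7",' \
  '  "registry.example/uipath/sf-k8-utils-rhel:0.8",' \
  '  "registry.example/uipath/other-image:1.0"' |
  latest_sf_k8_utils_tag
```

Because `sort` orders the tags lexically, a tag such as 0.10 would sort before 0.9, and a registry address that includes a port would shift the field that `awk` prints. If several tags are listed, it is worth checking the output by eye.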

Then update the CronJob block below by replacing SF_K8S_TAG with the returned value.
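
If you keep the CronJob block in a file, the replacement could be done with `sed`, as in this sketch. The function name `apply_sf_k8s_tag` and the file name in the usage comment are hypothetical.

```shell
#!/bin/bash
# Sketch: replace every SF_K8S_TAG placeholder in the text read from stdin.
# $1: the tag returned by the previous command
apply_sf_k8s_tag() {
  sed "s/SF_K8S_TAG/$1/g"
}

# Hypothetical usage, assuming the block was saved as thanos-cleaner-cronjob.sh:
#   apply_sf_k8s_tag "0.8" < thanos-cleaner-cronjob.sh > thanos-cleaner-cronjob.tagged.sh
```

Image tags cannot contain `/` or `&`, so the tag is safe to place directly in the `sed` expression.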

Once updated, paste the complete block into the terminal on any server node:

thanosns=monitoring && if kubectl get application -n argocd rancher-monitoring; then thanosns=cattle-monitoring-system; fi && thanosimage=$(kubectl  get statefulset -n $thanosns thanos-compact -o jsonpath='{.spec.template.spec.containers[0].image}') &&  cat <<EOF | kubectl apply -f -
---
apiVersion: batch/v1
kind: CronJob
metadata:
  name: thanos-cleaner
  namespace: ${thanosns}
spec:
  concurrencyPolicy: Forbid
  failedJobsHistoryLimit: 3
  jobTemplate:
    metadata:
      creationTimestamp: null
    spec:
      backoffLimit: 3
      template:
        metadata:
          annotations:
            sidecar.istio.io/inject: "false"
          creationTimestamp: null
          labels:
            app.kubernetes.io/name: thanos-cleaner-cronjob
        spec:
          containers:
          - args:
            - /script/thanos-cleanup.sh
            command:
            - /bin/bash
            env:
            - name: NAMESPACE
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: metadata.namespace
            - name: THANOS_CONFIG_KEY
              value: thanos.yaml
            - name: DISABLE_BLOCK_CLEANER
              value: "false"
            image: docker.io/uipath/sf-k8-utils-rhel:SF_K8S_TAG
            imagePullPolicy: IfNotPresent
            name: thanos-cleaner
            resources:
              limits:
                cpu: 200m
                memory: 400Mi
              requests:
                cpu: 20m
                memory: 64Mi
            terminationMessagePath: /dev/termination-log
            terminationMessagePolicy: File
            volumeMounts:
            - mountPath: /script/
              name: script
            - mountPath: /etc/thanos/
              name: thanos-objectstore-vol
            - mountPath: /thanos-bin/
              name: thanos
            - mountPath: /.kube/
              name: kubedir
            - mountPath: /tmp/
              name: tmpdir
          dnsPolicy: ClusterFirst
          initContainers:
          - args:
            - set -e; cp /bin/thanos /thanos-bin/thanos && chmod +x /thanos-bin/thanos
            command:
            - /bin/sh
            - -c
            image: ${thanosimage}
            imagePullPolicy: IfNotPresent
            name: copy-uipathcore-binary
            resources: {}
            terminationMessagePath: /dev/termination-log
            terminationMessagePolicy: File
            volumeMounts:
            - mountPath: /thanos-bin/
              name: thanos
          nodeSelector:
            kubernetes.io/os: linux
          restartPolicy: Never
          schedulerName: default-scheduler
          securityContext:
            fsGroup: 3000
            runAsGroup: 2000
            runAsNonRoot: true
            runAsUser: 1000
          serviceAccount: thanos-cleaner
          serviceAccountName: thanos-cleaner
          terminationGracePeriodSeconds: 120
          volumes:
          - emptyDir: {}
            name: kubedir
          - emptyDir: {}
            name: tmpdir
          - emptyDir: {}
            name: thanos
          - name: thanos-objectstore-vol
            secret:
              defaultMode: 420
              secretName: thanos-objectstore-config
          - configMap:
              defaultMode: 420
              name: thanos-cleaner-script
            name: script
  schedule: 0 1/6 * * *
  successfulJobsHistoryLimit: 2
  suspend: false
---
EOF
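
The CronJob runs on its schedule, so it may be useful to trigger one run by hand and read its logs instead of waiting. The sketch below creates a one-off job with `kubectl create job --from`, the same mechanism that `validate-cronjob.sh` uses. The function name `run_thanos_cleaner_once` and the job name `thanos-cleaner-manual` are hypothetical.

```shell
#!/bin/bash
# Sketch: trigger a single run of the thanos-cleaner CronJob and show its logs.
# $1: namespace of the CronJob (monitoring or cattle-monitoring-system)
run_thanos_cleaner_once() {
  ns=$1
  job=thanos-cleaner-manual   # hypothetical name for the one-off job
  # Remove a leftover job from an earlier manual run, if any.
  kubectl delete job -n "$ns" "$job" --ignore-not-found
  # Create a job from the CronJob's job template.
  kubectl create job -n "$ns" --from=cronjob/thanos-cleaner "$job"
  # Wait for completion; this times out if the job fails instead.
  kubectl wait -n "$ns" --for=condition=complete --timeout=20m "job/$job"
  # Print the output of thanos-cleanup.sh.
  kubectl logs -n "$ns" "job/$job"
}
```

For example, `run_thanos_cleaner_once cattle-monitoring-system` should end with either "Thanos compaction is working" or "Thanos compactor started" in the logs, depending on whether the compactor was halted.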