Automation Suite on Linux installation guide

Last updated Nov 13, 2025

Troubleshooting storage

Failure to compact metrics due to corrupted blocks in Thanos

Description

The Thanos compactor can fail to compact metrics when corrupted blocks are detected in the object store. This condition prevents the compactor from processing metrics, leading to increased storage usage in the Ceph bucket.
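
Before applying the workaround, you can confirm that compaction is actually halted. The check below is a minimal sketch based on the same namespace detection, pod label, and thanos_compact_halted metric used by the cleanup script in the Solution section; it assumes the pod IPs are reachable from the node you run it on. A value of 1 indicates that the compactor has halted.

    # Detect the monitoring namespace (monitoring or cattle-monitoring-system), then
    # read the thanos_compact_halted metric from each thanos-compact pod.
    thanosns=monitoring && if kubectl get application -n argocd rancher-monitoring; then thanosns=cattle-monitoring-system; fi
    for ip in $(kubectl get pods -n "$thanosns" -l app.kubernetes.io/instance=thanos-compact -o jsonpath="{.items[*].status.podIP}"); do
      # 1 = compaction halted; 0 = compactor healthy
      curl -s "http://${ip}:10902/metrics" | grep thanos_compact_halted | grep -v '#'
    done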

Solution

To fix this issue, take the following steps:
  1. On any server node, run the following script:
    thanosns=monitoring && if kubectl get application -n argocd rancher-monitoring; then thanosns=cattle-monitoring-system; fi && cat <<EOF | kubectl apply -f -
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: Role
    metadata:
      annotations:
      labels:
        app.kubernetes.io/component: thanos-cleaner
        app.kubernetes.io/instance: thanos-block-cleaner
        app.kubernetes.io/name: thanos-block-cleaner
      name: thanos-cleaner-role
      namespace: ${thanosns}
    rules:
    - apiGroups:
      - apps
      resources:
      - statefulsets
      - statefulsets/scale
      verbs:
      - list
      - get
      - update
      - patch
    - apiGroups:
      - batch
      resources:
      - jobs
      - cronjobs
      verbs:
      - delete
      - list
      - get
      - update
      - create
      - watch
    - apiGroups:
      - ""
      resources:
      - pods
      verbs:
      - delete
      - list
      - get
      - update
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: RoleBinding
    metadata:
      labels:
        app.kubernetes.io/component: thanos-cleaner
        app.kubernetes.io/instance: thanos-block-cleaner
        app.kubernetes.io/name: thanos-block-cleaner
      name: thanos-cleaner-role-binding
      namespace: ${thanosns}
    roleRef:
      apiGroup: rbac.authorization.k8s.io
      kind: Role
      name: thanos-cleaner-role
    subjects:
    - kind: ServiceAccount
      name: thanos-cleaner
      namespace: ${thanosns}
    ---
    apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: thanos-cleaner
      namespace: ${thanosns}
    ---
    apiVersion: monitoring.coreos.com/v1
    kind: PrometheusRule
    metadata:
      name: thanos-cleaner
      namespace: uipath
    spec:
      groups:
      - name: thanos
        rules:
        - alert: ThanosCompactorNotWorking
          annotations:
            description: Thanos compactor is not working. This will disable metrics compaction
              in objectstore bucket. Please check thanos compact pod in ${thanosns} namespace
              for any error. Compactor in faulty state will exhaust object store space
            message: Thanos compactor is not working. Please check if thanos cleaner job
              is functional and able to fix corruption
            runbook_url: https://docs.uipath.com/automation-suite/docs/alert-runbooks
            summary: Thanos compactor is not working
          expr: thanos_compactor_issue{job="thanos-cleaner"} >= 1
          for: 1d
          labels:
            app: thanos
            severity: critical
    ---
    EOF
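    Optionally, verify that the role, role binding, service account, and alert rule from this step exist. This is only a sanity check; it assumes you run it in the same shell session where the thanosns variable was set by the script above (otherwise re-run the namespace detection line first):

    kubectl get role,rolebinding,serviceaccount -n "$thanosns" | grep thanos-cleaner
    kubectl get prometheusrule thanos-cleaner -n uipath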
  2. On any server node, run the following script:
    cat <<'EOF' | kubectl apply -f -
    ---
    apiVersion: v1
    data:
      thanos-cleanup.sh: |
        #!/bin/bash
    
        # Copyright UiPath 2021
        #
        # =================
        # LICENSE AGREEMENT
        # -----------------
        #   Use of paid UiPath products and services is subject to the licensing agreement
        #   executed between you and UiPath. Unless otherwise indicated by UiPath, use of free
        #   UiPath products is subject to the associated licensing agreement available here:
        #   https://www.uipath.com/legal/trust-and-security/legal-terms (or successor website).
        #   You must not use this file separately from the product it is a part of or is associated with.
    
        set -eu -o pipefail
    
        export PATH=$PATH:/thanos-bin/
        # Below script removes the blocks which are overlapping or having index issue or having duplicated compaction
        #
        # In few cases with above mentioned scenarios, thanos may skip the compaction and halt the compaction module.
        # Compaction halt requires manual deletion of corrupted blocks and restart of compact pod.
    
        config_file=/etc/thanos/${THANOS_CONFIG_KEY}
    
        function info() {
          echo "[INFO] [$(date +'%Y-%m-%dT%H:%M:%S%z')]: $*"
        }
    
        function warn() {
          echo -e "\e[0;33m[WARN] [$(date +'%Y-%m-%dT%H:%M:%S%z')]:\e[0m $*" >&2
        }
    
        function error_without_exit() {
          echo -e "\e[0;31m[ERROR][$(date +'%Y-%m-%dT%H:%M:%S%z')]:\e[0m $*" >&2
        }
    
        function error() {
          echo -e "\e[0;31m[ERROR][$(date +'%Y-%m-%dT%H:%M:%S%z')]:\e[0m $*" >&2
          exit 1
        }
    
        function is_compaction_halted() {
          info "Checking if thanos compactor running"
    
          IFS=" " read -r -a compactor_addresses <<<"$(kubectl get pods -n "$NAMESPACE" -l app.kubernetes.io/instance=thanos-compact -o jsonpath="{.items[*].status.podIP}")"
    
          is_compactor_halted=0
    
          if [[ "${#compactor_addresses[@]}" -eq 0 ]]; then
            info "Thanos compactor pod is not running"
            is_compactor_halted=1
          fi
    
          for ip in "${compactor_addresses[@]}"; do
            #shellcheck disable=SC2086
            halted=$(curl -s http://${ip}:10902/metrics | grep thanos_compact_halted | grep -v '#' | awk -F ' ' '{print $2}')
            if [[ "$halted" -eq "1" ]]; then
              warn "Compaction is halted"
              is_compactor_halted=1
              break
            fi
          done
    
          return $is_compactor_halted
        }
    
        function execute_thanos_issue_command() {
          if [[ $# -ne 1 ]]; then
            error "missing issue name for execute_thanos_issue_command function"
          fi
    
          issue=$1
    
          info "Checking for issue $issue"
          cmd_ret=0
          #shellcheck disable=SC2086
          verify_output=$(thanos tools bucket --objstore.config-file=${config_file} verify --log.format=json -i $issue 2>&1) && true || cmd_ret=1
          if [[ $cmd_ret -eq 1 ]]; then
            error_without_exit "Output of $issue command: -> $verify_output"
            error "Failed to verify bucket for $issue"
          fi
    
          #shellcheck disable=SC2086
          echo $verify_output
        }
    
        function fix_index_issue() {
          info "Fixing index_known_issue issue"
    
          verify_output=$(execute_thanos_issue_command "index_known_issues")
          #shellcheck disable=SC2086
          for b in $(echo $verify_output | sed 's/} {/\r\n/g' | grep err | grep "detected issue" | awk -F '"id":' '{print $2}' | awk -F ',' '{print $1}' | tr -d '"'); do
            info "Block=$b is having the issue, removing it.."
    
            thanos tools bucket mark --id="$b" \
              --marker=deletion-mark.json \
              --details="deleted by job" \
              --objstore.config-file="${config_file}"
    
            info "Block=$b is marked for deletion"
          done
    
          info "Fixing index_known_issue issue done"
        }
    
        function fix_overlapping_issue() {
          info "Fixing overlapped_blocks issue"
    
          overlap_output=$(execute_thanos_issue_command "overlapped_blocks")
    
          while IFS= read -r line; do
            #shellcheck disable=SC2086
            for b in $(echo $line | awk -F '"overlap":' '{print $2}' | awk -v search="ulid" 'match($0, search) {print substr($0, RSTART)}' | sed 's/ulid/\r\nulid/g' | awk -F ',' '{print $1}' | grep '^ulid' | awk -F ': ' '{print $2}'); do
              info "Block=$b is having the issue, removing it.."
              thanos tools bucket mark --id="$b" \
                --marker=deletion-mark.json \
                --details="deleted by job" \
                --objstore.config-file="${config_file}"
    
              info "Block=$b is marked for deletion"
            done
          done < <(echo "$overlap_output" | sed 's/} {/\r\n/g' | grep "found overlapped blocks")
    
          info "Fixing overlapped_blocks issue done"
        }
    
        function fix_duplicate_issue() {
          info "Fixing duplicated_compaction issue"
          duplicate_output=$(execute_thanos_issue_command "duplicated_compaction")
          #shellcheck disable=SC2086,SC2006
          for b in $(echo $duplicate_output | sed 's/ts=2/\r\n2/g' | grep "Found duplicated blocks that are ok to be removed" | awk -F 'ULIDs="' '{print $2}' | tr -d '[]' | awk -F '"' '{print $1}'); do
            info "Block=$b is having the issue, removing it.."
            thanos tools bucket mark --id="$b" \
              --marker=deletion-mark.json \
              --details="deleted by job" \
              --objstore.config-file="${config_file}"
    
            info "Block=$b is marked for deletion"
          done
    
          info "Fixing duplicated_compaction issue done"
        }
    
        if [[ -z "$NAMESPACE" ]]; then
          error "NAMESPACE is not set"
        fi
    
        # We will check if compaction is halted or not before checking for issues
        if is_compaction_halted; then
          info "Thanos compaction is working"
          echo "thanos_compactor_issue 0" | curl --data-binary @- "http://pushgateway-prometheus-pushgateway.uipath.svc.cluster.local:9091/metrics/job/thanos-cleaner"
          exit 0
        fi
    
        warn "Thanos compactor is not working. Checking for corrupted blocks..."
        echo "thanos_compactor_issue 1" | curl --data-binary @- "http://pushgateway-prometheus-pushgateway.uipath.svc.cluster.local:9091/metrics/job/thanos-cleaner"
    
        if [[ "$DISABLE_BLOCK_CLEANER" == true ]]; then
          info "DISABLE_BLOCK_CLEANER is set to $DISABLE_BLOCK_CLEANER, skipping block clean"
          exit 0
        fi
    
        info "DISABLE_BLOCK_CLEANER is set to $DISABLE_BLOCK_CLEANER, removing corrupted blocks"
    
        replica=$(kubectl get sts -n "$NAMESPACE" thanos-compact -o jsonpath='{.spec.replicas}')
    
        # compactor must not be running while deleting blocks
    
        info "Stopping compactor"
        kubectl scale sts -n "$NAMESPACE" thanos-compact --replicas=0
        kubectl delete pods -n "$NAMESPACE" -l app.kubernetes.io/instance=thanos-compact --force
    
        # fixing index_known_issues
        info "Checking blocks having issue"
    
        fix_index_issue
        fix_overlapping_issue
        fix_duplicate_issue
    
        info "Triggering deletion of all marked blocks"
    
        #shellcheck disable=SC2086
        thanos tools bucket cleanup --delete-delay=0 --objstore.config-file=${config_file}
    
        info "Corrupted blocks are deleted"
    
        info "Scaling thanos compactor's replica to $replica"
        #shellcheck disable=SC2086
        kubectl scale sts -n "$NAMESPACE" thanos-compact --replicas=$replica
        info "Thanos compactor started"
      validate-cronjob.sh: |
        #!/bin/bash
    
        # Copyright UiPath 2021
        #
        # =================
        # LICENSE AGREEMENT
        # -----------------
        #   Use of paid UiPath products and services is subject to the licensing agreement
        #   executed between you and UiPath. Unless otherwise indicated by UiPath, use of free
        #   UiPath products is subject to the associated licensing agreement available here:
        #   https://www.uipath.com/legal/trust-and-security/legal-terms (or successor website).
        #   You must not use this file separately from the product it is a part of or is associated with.
    
        set -eu -o pipefail
    
        function info() {
          echo "[INFO] [$(date +'%Y-%m-%dT%H:%M:%S%z')]: $*"
        }
    
        function warn() {
          echo -e "\e[0;33m[WARN] [$(date +'%Y-%m-%dT%H:%M:%S%z')]:\e[0m $*" >&2
        }
    
        function error_without_exit() {
          echo -e "\e[0;31m[ERROR][$(date +'%Y-%m-%dT%H:%M:%S%z')]:\e[0m $*" >&2
        }
    
        function error() {
          echo -e "\e[0;31m[ERROR][$(date +'%Y-%m-%dT%H:%M:%S%z')]:\e[0m $*" >&2
          exit 1
        }
    
        alias kubectl='kubectl --cache-dir=/tmp/'
        IFS="," read -ra cronjobs <<<"$CRONJOB_LIST"
    
        for cr in "${cronjobs[@]}"; do
          #shellcheck disable=SC2206
          name=(${cr//// })
          cronNs=default
          cronName=""
    
          if [[ ${#name[@]} -gt 2 || ${#name[@]} -lt 1 ]]; then
            error "Invalid cronjob name=$cr"
          fi
    
          if [[ ${#name[@]} -eq 2 ]]; then
            cronNs=${name[0]}
            cronName=${name[1]}
          else
            cronName=${name[0]}
          fi
    
          info "Validating cronjob=$cr"
    
          jobName="${cronName}-sf-job-validation"
    
          created=1
          info "Creating validation job for $cr"
          kubectl delete job -n "${cronNs}" "${jobName}" --ignore-not-found --timeout=3m
    
          #shellcheck disable=SC2086
          kubectl create job -n "${cronNs}" --from=cronjob/${cronName} "$jobName" || created=0
    
          if [[ $created == 0 ]]; then
            error "Failed to create job for $cr"
          fi
    
          #shellcheck disable=SC2086
          kubectl wait --timeout=20m --for=condition=complete -n "${cronNs}" job/$jobName &
          cpid=$!
    
          #shellcheck disable=SC2086
          kubectl wait --timeout=20m --for=condition=failed -n "${cronNs}" job/${jobName} && exit 1 &
          fpid=$!
    
          ret=0
          wait -n $cpid $fpid || ret=1
    
          kill -9 $cpid || true
          kill -9 $fpid || true
    
          if [[ $ret -eq 0 ]]; then
            info "Job for $cr is validated/completed"
            #ignore deletion error. if deletion fail then will get caught in next sync. This is to reduce failure during installation
            kubectl delete job -n "${cronNs}" "${jobName}" --timeout=3m || true
          else
            error "Job for $cr failed"
          fi
        done
    kind: ConfigMap
    metadata:
      name: thanos-cleaner-script
      namespace: cattle-monitoring-system
    ---
    EOF
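    Optionally, confirm that the ConfigMap holding the cleanup and validation scripts was created. Note that the manifest above creates it in the cattle-monitoring-system namespace:

    kubectl get configmap thanos-cleaner-script -n cattle-monitoring-system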
  3. Replace SF_K8S_TAG with the correct image tag, then apply the cron job.

    From the installer directory on any server node, get the latest tag:

    cat versions/docker-images.json | grep uipath/sf-k8-utils-rhel | tr -d ',"' | awk -F ':' '{print $2}' | sort | uniq | tail -1
    
    Then update the cron job block by replacing SF_K8S_TAG with the value returned above; a sketch of one way to automate this substitution is shown after the manifest below.

    Once updated, paste the complete block into the terminal on any server node:

    thanosns=monitoring && if kubectl get application -n argocd rancher-monitoring; then thanosns=cattle-monitoring-system; fi && thanosimage=$(kubectl  get statefulset -n $thanosns thanos-compact -o jsonpath='{.spec.template.spec.containers[0].image}') &&  cat <<EOF | kubectl apply -f -
    ---
    apiVersion: batch/v1
    kind: CronJob
    metadata:
      name: thanos-cleaner
      namespace: ${thanosns}
    spec:
      concurrencyPolicy: Forbid
      failedJobsHistoryLimit: 3
      jobTemplate:
        metadata:
          creationTimestamp: null
        spec:
          backoffLimit: 3
          template:
            metadata:
              annotations:
                sidecar.istio.io/inject: "false"
              creationTimestamp: null
              labels:
                app.kubernetes.io/name: thanos-cleaner-cronjob
            spec:
              containers:
              - args:
                - /script/thanos-cleanup.sh
                command:
                - /bin/bash
                env:
                - name: NAMESPACE
                  valueFrom:
                    fieldRef:
                      apiVersion: v1
                      fieldPath: metadata.namespace
                - name: THANOS_CONFIG_KEY
                  value: thanos.yaml
                - name: DISABLE_BLOCK_CLEANER
                  value: "false"
                image: docker.io/uipath/sf-k8-utils-rhel:SF_K8S_TAG
                imagePullPolicy: IfNotPresent
                name: thanos-cleaner
                resources:
                  limits:
                    cpu: 200m
                    memory: 400Mi
                  requests:
                    cpu: 20m
                    memory: 64Mi
                terminationMessagePath: /dev/termination-log
                terminationMessagePolicy: File
                volumeMounts:
                - mountPath: /script/
                  name: script
                - mountPath: /etc/thanos/
                  name: thanos-objectstore-vol
                - mountPath: /thanos-bin/
                  name: thanos
                - mountPath: /.kube/
                  name: kubedir
                - mountPath: /tmp/
                  name: tmpdir
              dnsPolicy: ClusterFirst
              initContainers:
              - args:
                - set -e; cp /bin/thanos /thanos-bin/thanos && chmod +x /thanos-bin/thanos
                command:
                - /bin/sh
                - -c
                image: ${thanosimage}
                imagePullPolicy: IfNotPresent
                name: copy-uipathcore-binary
                resources: {}
                terminationMessagePath: /dev/termination-log
                terminationMessagePolicy: File
                volumeMounts:
                - mountPath: /thanos-bin/
                  name: thanos
              nodeSelector:
                kubernetes.io/os: linux
              restartPolicy: Never
              schedulerName: default-scheduler
              securityContext:
                fsGroup: 3000
                runAsGroup: 2000
                runAsNonRoot: true
                runAsUser: 1000
              serviceAccount: thanos-cleaner
              serviceAccountName: thanos-cleaner
              terminationGracePeriodSeconds: 120
              volumes:
              - emptyDir: {}
                name: kubedir
              - emptyDir: {}
                name: tmpdir
              - emptyDir: {}
                name: thanos
              - name: thanos-objectstore-vol
                secret:
                  defaultMode: 420
                  secretName: thanos-objectstore-config
              - configMap:
                  defaultMode: 420
                  name: thanos-cleaner-script
                name: script
      schedule: 0 1/6 * * *
      successfulJobsHistoryLimit: 2
      suspend: false
    ---
    EOF
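    As referenced above, one way to handle the SF_K8S_TAG substitution without editing the block by hand is sketched below. Because the heredoc is unquoted, a shell variable set beforehand expands when the block is pasted, so you only need to point the image line at that variable. The job name used for the manual run is a hypothetical example.

    # Capture the latest tag from the installer directory into a shell variable,
    # then set the image line in the block above to:
    #   image: docker.io/uipath/sf-k8-utils-rhel:${tag}
    tag=$(cat versions/docker-images.json | grep uipath/sf-k8-utils-rhel | tr -d ',"' | awk -F ':' '{print $2}' | sort | uniq | tail -1)

    # After applying the manifest, verify the CronJob and optionally trigger an
    # immediate run instead of waiting for the next scheduled execution.
    kubectl get cronjob thanos-cleaner -n "$thanosns"
    kubectl create job -n "$thanosns" --from=cronjob/thanos-cleaner thanos-cleaner-manual-run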