Automation Suite on Linux Installation Guide
Last updated Dec 3, 2024
How to clean up old logs stored in the sf-logs bucket
A bug might cause log accumulation in the sf-logs object store bucket. To clean up old logs in the sf-logs bucket, run the dedicated cleanup script as described below, making sure to follow the steps relevant to your environment type (online or offline).
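If you want to confirm that logs are actually accumulating before running the cleanup, you can inspect the current size of the sf-logs bucket. The following sketch is one possible way to do this; it assumes the Ceph toolbox deployment (rook-ceph-tools) is available in the rook-ceph namespace of your cluster, so adjust the name if your deployment differs.

  # Rough check of how much data the sf-logs bucket currently holds.
  # Assumption: a rook-ceph-tools deployment exists in the rook-ceph namespace.
  kubectl -n rook-ceph exec deploy/rook-ceph-tools -- \
    radosgw-admin bucket stats --bucket=sf-logs | grep -E '"num_objects"|"size_kb_actual"'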
To clean up old logs stored in the sf-logs bucket, take the following steps:
1. Get the version of the sf-k8-utils-rhel image available in your environment:
   - In an offline environment, run the following command:

     podman search localhost:30071/uipath/sf-k8-utils-rhel --tls-verify=false --list-tags

   - In an online environment, run the following command:

     podman search registry.uipath.com/uipath/sf-k8-utils-rhel --list-tags
2. Update the image field of the cleanup-old-logs pod in the following yaml definition so that it uses the image tag you retrieved in the previous step (a scripted way to set the tag is sketched after this procedure):

apiVersion: v1
kind: ConfigMap
metadata:
  name: cleanup-script
  namespace: uipath-infra
data:
  cleanup_old_logs.sh: |
    #!/bin/bash
    function parse_args() {
      CUTOFFDAY=7
      SKIPDRYRUN=0
      while getopts 'c:sh' flag "$@"; do
        case "${flag}" in
        c)
          CUTOFFDAY=${OPTARG}
          ;;
        s)
          SKIPDRYRUN=1
          ;;
        h)
          display_usage
          exit 0
          ;;
        *)
          echo "Unexpected option ${flag}"
          display_usage
          exit 1
          ;;
        esac
      done
      shift $((OPTIND - 1))
    }
    function display_usage() {
      echo "usage: $(basename "$0") -c <number> [-s]"
      echo "  -s  skip dry run, Really deletes the log dirs"
      echo "  -c  logs older than how many days to be deleted. Default is 7 days"
      echo "  -h  help"
      echo "NOTE: Default is dry run, to really delete logs set -s"
    }
    function setS3CMDContext() {
      OBJECT_GATEWAY_INTERNAL_HOST=$(kubectl -n rook-ceph get services/rook-ceph-rgw-rook-ceph -o jsonpath="{.spec.clusterIP}")
      OBJECT_GATEWAY_INTERNAL_PORT=$(kubectl -n rook-ceph get services/rook-ceph-rgw-rook-ceph -o jsonpath="{.spec.ports[0].port}")
      AWS_ACCESS_KEY=$1
      AWS_SECRET_KEY=$2
      # Reference https://rook.io/docs/rook/v1.5/ceph-object.html#consume-the-object-storage
      export AWS_HOST=$OBJECT_GATEWAY_INTERNAL_HOST
      export AWS_ENDPOINT=$OBJECT_GATEWAY_INTERNAL_HOST:$OBJECT_GATEWAY_INTERNAL_PORT
      export AWS_ACCESS_KEY_ID=$AWS_ACCESS_KEY
      export AWS_SECRET_ACCESS_KEY=$AWS_SECRET_KEY
    }
    # Set s3cmd context by passing correct AccessKey and SecretKey
    function setS3CMDContextForLogs() {
      BUCKET_NAME='sf-logs'
      AWS_ACCESS_KEY=$(kubectl -n cattle-logging-system get secret s3-store-secret -o json | jq '.data.OBJECT_STORAGE_ACCESSKEY' | sed -e 's/^"//' -e 's/"$//' | base64 -d)
      AWS_SECRET_KEY=$(kubectl -n cattle-logging-system get secret s3-store-secret -o json | jq '.data.OBJECT_STORAGE_SECRETKEY' | sed -e 's/^"//' -e 's/"$//' | base64 -d)
      setS3CMDContext "$AWS_ACCESS_KEY" "$AWS_SECRET_KEY"
    }
    function delete_old_logs() {
      local cutoffdate=$1
      days=$(s3cmd ls s3://sf-logs/ --host="${AWS_HOST}" --host-bucket= s3://sf-logs --no-check-certificate --no-ssl)
      days=${days//DIR}
      if [[ $SKIPDRYRUN -eq 0 ]]; then
        echo "DRY RUN. Following log dirs are selected for deletion"
      fi
      for day in $days
      do
        day=${day#*sf-logs/}
        day=${day::-1}
        if [[ ${day} < ${cutoffdate} ]]; then
          if [[ $SKIPDRYRUN -eq 0 ]]; then
            echo "s3://$BUCKET_NAME/$day"
          else
            echo "###############################################################"
            echo "Deleting Logs for day: {$day}"
            echo "###############################################################"
            s3cmd del "s3://$BUCKET_NAME/$day/" --host="${AWS_HOST}" --host-bucket= --no-ssl --recursive || true
          fi
        fi
      done
    }
    function main() {
      # Set S3 context by setting correct env variables
      setS3CMDContextForLogs
      echo "Bucket name is $BUCKET_NAME"
      CUTOFFDATE=$(date --date="${CUTOFFDAY} day ago" +%Y_%m_%d)
      echo "logs older than ${CUTOFFDATE} will be deleted"
      delete_old_logs "${CUTOFFDATE}"
      if [[ $SKIPDRYRUN -eq 0 ]]; then
        echo "NOTE: For really deleting the old log directories run with -s option"
      fi
    }
    parse_args "$@"
    main
    exit 0
---
apiVersion: v1
kind: Pod
metadata:
  name: cleanup-old-logs
  namespace: uipath-infra
spec:
  serviceAccountName: fluentd-logs-cleanup-sa
  containers:
  - name: cleanup
    image: localhost:30071/uipath/sf-k8-utils-rhel:0.8
    command: ["/bin/bash"]
    args: ["/scripts-dir/cleanup_old_logs.sh", "-s"]
    volumeMounts:
    - name: scripts-vol
      mountPath: /scripts-dir
    securityContext:
      privileged: false
      allowPrivilegeEscalation: false
      readOnlyRootFilesystem: true
      runAsUser: 9999
      runAsGroup: 9999
      runAsNonRoot: true
      capabilities:
        drop: ["NET_RAW"]
  volumes:
  - name: scripts-vol
    configMap:
      name: cleanup-script
3. Copy the content of the yaml definition above to a file called cleanup.yaml, then trigger a pod to clean up the old logs:

   kubectl apply -f cleanup.yaml

   Note that the pod runs the script with the -s option, which permanently deletes the selected log directories; without -s the script only performs a dry run (see the note after this procedure).
4. Get details on the progress of the cleanup:

   kubectl -n uipath-infra logs cleanup-old-logs -f
5. When the cleanup is complete, delete the pod and its ConfigMap:

   kubectl delete -f cleanup.yaml
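If you prefer to script the image tag lookup, the sketch below captures the newest sf-k8-utils-rhel tag and patches it into cleanup.yaml (offline example; for online installations use registry.uipath.com and drop --tls-verify=false). The tag selection is an assumption, as it simply takes the highest version-sorted tag, so verify the value before applying the manifest.

  # Pick the newest sf-k8-utils-rhel tag (assumption: the highest version-sorted tag is the one to use).
  TAG=$(podman search localhost:30071/uipath/sf-k8-utils-rhel --tls-verify=false --list-tags \
    | awk 'NR>1 {print $2}' | sort -V | tail -1)
  echo "Using sf-k8-utils-rhel tag: ${TAG}"
  # Point the cleanup pod at that tag in cleanup.yaml (the file created in step 3).
  sed -i "s|sf-k8-utils-rhel:[^ ]*|sf-k8-utils-rhel:${TAG}|" cleanup.yaml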
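The pod definition runs the script with -s, which deletes the selected day directories immediately. According to the script's own usage text, omitting -s performs a dry run that only lists what would be deleted, and -c <days> changes the default seven-day cutoff. The following is a minimal sketch for previewing the deletion first, assuming you edit cleanup.yaml before applying it:

  # Switch the pod to dry-run mode by dropping the -s argument, then apply and review the output.
  sed -i 's|\["/scripts-dir/cleanup_old_logs.sh", "-s"\]|["/scripts-dir/cleanup_old_logs.sh"]|' cleanup.yaml
  kubectl apply -f cleanup.yaml
  kubectl -n uipath-infra logs cleanup-old-logs -f
  # Once you are happy with the listed directories, delete the pod, restore the -s argument, and apply again.
  kubectl delete -f cleanup.yaml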