In-place upgrade timeouts
Automation Suite on Linux Installation Guide
Last updated Jan 22, 2025
The in-place upgrade times out with the following error:
Error: cannot execute upgrade plan on agent nodes: context canceled
In this scenario, all nodes were upgraded, even though the main log reports an upgrade timeout.
For example:
- The upgrade log for the first node shows a failed upgrade:
  [INFO] [2023-10-11T12:51:12+0000]: Running upgrade on all nodes...
  Error: cannot execute upgrade plan on agent nodes: context canceled
  [INFO] [2023-10-11T13:51:12+0000]: Node details:
  [INFO] [2023-10-11T13:51:12+0000]: NAME      STATUS   ROLES                       AGE     VERSION
  agent0    Ready    <none>                      6h44m   v1.26.5+rke2r1
  server0   Ready    control-plane,etcd,master   7h18m   v1.26.5+rke2r1
  server1   Ready    control-plane,etcd,master   7h2m    v1.26.5+rke2r1
  server2   Ready    control-plane,etcd,master   6h53m   v1.26.5+rke2r1
  [INFO] [2023-10-11T13:51:12+0000]: Refer the log files in /opt/UiPathAutomationSuite/_upgrade/UiPath_Installer/Modules/../upgrade-logs/upgrade-2023.10.0-rc.12
  [INFO] [2023-10-11T13:51:16+0000]: Rke upgrade information is available at system-upgrade namespace. Use kubectl command to get the logs and events.
  [INFO] [2023-10-11T13:51:16+0000]: Logs also present in the corresponding node at /opt/UiPathAutomationSuite/_autoupgrade
  [ERROR][2023-10-11T13:51:16+0000]: Upgrade failed. Please fix the errors and try again.
- The upgrade log for the agent node shows a successful upgrade:
  [INFO] [2023-10-12T12:42:53+0000]: Infra installed successfully
  [INFO] [2023-10-12T12:42:53+0000]: Validate rke2 images update in the upgrade hook
  [INFO] [2023-10-12T12:42:53+0000]: checking all rke2 pods are up and running in the current node
  pod/kube-proxy-agent0 condition met
  [INFO] [2023-10-12T12:42:54+0000]: Uncordoning node in the upgrade hook
  Checking if node is ready to run kubectl command.
  Node is ready to accept kubectl command
  Enable IP Forwarding...
  Either file /etc/sysctl.conf not present or file is not writable. Enabling ip forward using /proc/sys/net/ipv4/ip_forward...
  Uncordon agent0 ...
  label "nodejanitor/skip" not found.
  node/agent0 not labeled
  node/agent0 already uncordoned
  node/agent0 annotated
  [INFO] [2023-10-12T12:42:54+0000]: Upgrade successfully completed.
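Because the main log reports a failure even when the nodes have been upgraded, it can help to check the node table that the same log prints under "Node details". The following Python snippet is a minimal sketch of such a check, not part of the Automation Suite tooling. It assumes the table uses the NAME, STATUS, ROLES, AGE, VERSION columns shown in the example above. The function names parse_node_table and all_nodes_upgraded are hypothetical, and the expected version string is an assumption that you must supply for your own upgrade target.

```python
import re

# One row of the node table: NAME STATUS ROLES AGE VERSION.
# The header row is skipped because "VERSION" does not start with a lowercase "v".
NODE_ROW = re.compile(
    r"^(?P<name>\S+)\s+(?P<status>\S+)\s+(?P<roles>\S+)"
    r"\s+(?P<age>\S+)\s+(?P<version>v\S+)$"
)


def parse_node_table(log_text):
    """Return one dict per node row found in the given log text."""
    nodes = []
    for line in log_text.splitlines():
        match = NODE_ROW.match(line.strip())
        if match:
            nodes.append(match.groupdict())
    return nodes


def all_nodes_upgraded(nodes, expected_version):
    """True only if at least one node was found and every node is Ready
    and reports exactly the expected version (an assumption of this sketch)."""
    return bool(nodes) and all(
        node["status"] == "Ready" and node["version"] == expected_version
        for node in nodes
    )


# Excerpt copied from the example log above.
SAMPLE_LOG = """\
[INFO] [2023-10-11T13:51:12+0000]: Node details:
[INFO] [2023-10-11T13:51:12+0000]: NAME STATUS ROLES AGE VERSION
agent0 Ready <none> 6h44m v1.26.5+rke2r1
server0 Ready control-plane,etcd,master 7h18m v1.26.5+rke2r1
server1 Ready control-plane,etcd,master 7h2m v1.26.5+rke2r1
server2 Ready control-plane,etcd,master 6h53m v1.26.5+rke2r1
"""

if __name__ == "__main__":
    nodes = parse_node_table(SAMPLE_LOG)
    for node in nodes:
        print(node["name"], node["status"], node["version"])
    print("all upgraded:", all_nodes_upgraded(nodes, "v1.26.5+rke2r1"))
```

The check is deliberately strict: a node whose status is anything other than Ready, or whose version differs from the expected string, makes the result false, so a false result calls for a closer look at that node's own upgrade log rather than proving that the upgrade failed.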