Deployment architecture
For more information on the core concepts used in an Automation Suite deployment, refer to Glossary.
Automation Suite supports the following two deployment modes:
| Deployment mode | Description |
|---|---|
| Single-node — evaluation | Supported for evaluation and demo scenarios. |
| Multi-node — production, HA-enabled | Supported for production use. You can perform additional configuration post-deployment to have full HA capabilities. |
See Supported use cases for single-node and multi-node installations for more details on how to choose the deployment mode that best suits your needs.
This page offers insight into the Automation Suite architecture and describes the components bundled into the installer.
A server node hosts the cluster management services (control plane) that perform important cluster operations such as workload orchestration, cluster state management, and load balancing of incoming requests. Depending on the available resources, server nodes may also run some of the UiPath® products and shared components.
An agent node is responsible for running the UiPath® products and shared components only.
A specialized agent node runs special workloads such as Task Mining analysis, Document Understanding pipelines that require GPU capability, or Automation Suite Robots. However, the core Task Mining, Document Understanding, and Automation Suite Robots services still run on the server or agent nodes. Specialized agent nodes do not host any UiPath® products or shared components.
Here, a single-node evaluation deployment means a single server node; it does not imply that the entire Automation Suite runs on a single machine. You may have to add agent or specialized agent nodes if the entire product suite cannot fit on a single server node, or if you want to run special workloads such as Task Mining analysis and Document Understanding pipelines, which require GPU capabilities.
A multi-node HA-ready production deployment involves three or more server nodes behind a load balancer. This ensures that, in the event of a disaster, when any of the server nodes goes down, Automation Suite is still available to perform critical business workflows. The number of agent nodes is optional and depends on actual usage.
In a multi-node setup, High Availability (HA) is enabled by default. However, the Redis-based in-memory cache used by cluster services is running on a single pod and represents a single point of failure. To mitigate the impact of a cache node failure or restart, you can purchase the High Availability Add-on (HAA), which enables redundant, multi-pod deployment of the cache.
For more details on how to enable HAA in a multi-node setup, see Enabling High Availability Add-on for the cluster.
An online deployment means that Automation Suite requires access to the internet during both installation and runtime. All the UiPath® products and supporting libraries are hosted either in the UiPath® registry or in a UiPath-trusted third-party store.
You can restrict access to the internet with the help of either a restricted firewall or a proxy server, blocking all internet traffic other than what Automation Suite requires. This type of setup is also known as a semi-online deployment. For more details, see Configuring the firewall and Configuring the proxy server.
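In a proxy-based semi-online setup, outbound traffic is typically routed through the proxy via standard environment variables, while cluster-internal traffic must bypass it. The sketch below is a minimal, hypothetical example: the proxy host/port, pod/service CIDRs, and cluster FQDN are assumptions, not defaults shipped with Automation Suite.

```shell
# Hypothetical proxy environment for a semi-online deployment.
# proxy.example.com:3128, the CIDRs, and the FQDN below are assumptions.
export HTTP_PROXY="http://proxy.example.com:3128"
export HTTPS_PROXY="http://proxy.example.com:3128"
# Cluster-internal and node-to-node traffic must NOT go through the proxy.
export NO_PROXY="localhost,127.0.0.1,10.0.0.0/8,.cluster.local,automationsuite.mycompany.com"

echo "Proxy: $HTTPS_PROXY"
echo "Bypass list: $NO_PROXY"
```

Getting the bypass list right matters: if pod or service CIDRs are missing from `NO_PROXY`, in-cluster requests are sent to the proxy and fail.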
These types of deployments are easier and faster, and require fewer hardware resources to install and manage compared to offline deployments.
An offline (air-gapped) deployment is a completely isolated setup without access to the internet. This kind of setup requires the installation of an additional registry to store all the UiPath® products' container images and binaries, which are shipped as a tarball.
Uploading the binaries to the registry (a process known as hydration) introduces additional hardware requirements and installation complexity, increasing the time required to perform an installation compared to an online deployment.
An offline installation increases complexity not only during installation, but also for cluster management operations such as machine maintenance, disaster recovery, upgrading to newer versions, and applying security patches.
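Conceptually, hydration amounts to retagging every bundled image so it points at the private registry and pushing it there. The sketch below only prints the commands it would run; the image names and the registry host are illustrative assumptions, not the actual contents of the Automation Suite tarball.

```shell
# Hypothetical hydration sketch: the registry host and image names are
# assumptions for illustration; real bundles ship their own image list.
PRIVATE_REGISTRY="registry.example.com:30071"

# Images as they might appear inside an offline bundle (assumed names).
SOURCE_IMAGES="
registry.uipath.com/uipath/orchestrator:23.10.2
registry.uipath.com/rancher/rke2-runtime:v1.26.11
"

for src in $SOURCE_IMAGES; do
  # Swap the source registry host for the private registry host,
  # keeping the repository path and tag intact.
  target="${PRIVATE_REGISTRY}/${src#*/}"
  # A real run would execute these commands; here we only print them.
  echo "docker tag  $src $target"
  echo "docker push $target"
done
```

The retagging rule (replace everything before the first `/` with the private registry host) is what preserves repository paths such as `uipath/orchestrator` inside the new registry.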
You cannot change the deployment method post-installation: an online installation cannot be converted to offline, and vice versa. Choose your deployment strategy after careful consideration.
The following table lists the third-party components shipped with Automation Suite:

| Component | Optional/Required | Description |
|---|---|---|
| RKE2 | Required | Rancher-provided Kubernetes distribution. It is the container orchestration platform that runs all the architectural components and services. |
| Ceph Object Store | Optional if you have an external objectstore | Open-source storage provider that exposes Amazon S3-compliant object/blob storage. It enables services to use blob storage-like functionality for their operations. |
| Argo CD | Required | Open-source declarative CD tool for Kubernetes. It follows the GitOps pattern of using Git repositories as the source of truth for defining the desired application state. It provides application lifecycle management (ALM) capabilities for Automation Suite components and UiPath® services that run in a Kubernetes cluster. |
| Docker registry | Optional if you have an external registry | Open-source Docker registry used for pushing and pulling install-time and runtime container images on your premises. |
| Istio | Required | Open-source service mesh that provides functionality such as ingress, request routing, and traffic monitoring for the microservices running inside the Kubernetes cluster. |
| Prometheus | Required | Open-source system monitoring toolkit for Kubernetes. It can scrape or accept metrics from Kubernetes components as well as workloads running in the cluster and store them in a time-series database. |
| Grafana | Required | Open-source visualization tool used for querying and visualizing data stored in Prometheus. You can create and ship a variety of dashboards for cluster and service monitoring. |
| Alertmanager | Required | Open-source tool that handles alerts sent by client applications such as the Prometheus server. It is responsible for deduplicating, grouping, and routing them to the correct receiver integrations, such as email, PagerDuty, or OpsGenie. |
| Redis | Required | Redis Enterprise non-HA (single shard) used by some UiPath® services for centralized cache functionality. |
| Fluentd and Fluent Bit | Required | Open-source, reliable log scraping solution. The logging operator deploys and configures a background process on every node to collect container and application logs from the node file system. |
| Gatekeeper | Required | Open-source tool that allows a Kubernetes administrator to implement policies for ensuring compliance and best practices in their cluster. |
| Velero | Required¹ | Open-source tool that allows you to take snapshot backups and restore them. |
| Thanos | Required | Open-source tool that pushes Prometheus metrics to an objectstore for persistence. |

¹ Only installed during backup and restore.