AWS deployment architecture

Automation Suite on Linux Installation Guide
Last updated Feb 13, 2025
- Uipath-sf:
  - SSL stack
  - Routing stack
  - Server stack
  - Database stack
  - Backup stack
  - Management stack
  - Lambda functions (AWS::Lambda::Function):
    - FindAMIFunction – for finding a matching AMI ID.
    - CreateInputJsonFunction – for creating the configuration used by the Automation Suite installer.
    - ComputeResourceSizeFunction – for computing the minimum EC2 instance hardware configuration needed, based on the selected services and deployment type.
  - IAM roles (AWS::IAM::Role) for the Lambda functions to provide minimum permissions:
    - FindAmiLambdaRole
    - CreateInputJsonLambdaRole
    - ComputeResourceSizeLambdaRole
  - Secrets (AWS::SecretsManager::Secret) to store sensitive information:
    - RDSPassword
    - OrgSecret
    - PlatformSecret
    - ArgoCdSecret
    - ArgoCdUserSecret
    - InputJsonSecret
    - KubeconfigSecret
- SSL Stack (optional)
- Network stack (optional)
- Backup stack (optional):
  - ClusterBackupStorage (AWS::EFS::FileSystem) – Amazon Elastic File System used to store the backup.
  - SharedStorageSecurityGroup (AWS::EC2::SecurityGroup) – Security group used to allow NFS network connections from the cluster nodes.
  - SharedStorageMountTargetOne (AWS::EFS::MountTarget) – Resource that creates the mount target for the EFS file system and the first private subnet.
  - SharedStorageMountTargetTwo (AWS::EFS::MountTarget) – Resource that creates the mount target for the EFS file system and the second private subnet.
  - SharedStorageMountTargetThree (AWS::EFS::MountTarget) – Optional resource that creates the mount target for the EFS file system and the third private subnet.
- Database stack:
  - RDSDBInstance (AWS::RDS::DBInstance) – The Amazon RDS DB instance. The DB SKU is db.m5.2xlarge.
  - DBSubnetGroup (AWS::RDS::DBSubnetGroup) – Private subnet group that contains the private subnets.
  - DbSecurityGroup (AWS::EC2::SecurityGroup) – Security Group allowing access to the DB instance.
  - PMRDSDBInstance (AWS::RDS::DBInstance) – Dedicated Amazon RDS DB instance for Process Mining. Only deployed when Process Mining is enabled and the deployment is Multi Node. The DB SKU is db.m5.4xlarge.
- Routing stack:
  NOTE: The Alb and Nlb stacks are mutually exclusive configurations.
  - Alb stack:
    - ExternalLoadBalancer (AWS::ElasticLoadBalancingV2::LoadBalancer) – Application load balancer used to distribute Automation Suite traffic. It can be internal or internet-facing.
    - ELBSecurityGroup (AWS::EC2::SecurityGroup) – The security group applied to the load balancer.
    - HttpsTargetGroup (AWS::ElasticLoadBalancingV2::TargetGroup) – The target group of the load balancer.
    - HttpsListener (AWS::ElasticLoadBalancingV2::Listener) – The listener for the load balancer.
  - Nlb stack:
    - ExternalLoadBalancer (AWS::ElasticLoadBalancingV2::LoadBalancer) – Network load balancer used to distribute Automation Suite traffic. It can be internal or internet-facing.
    - TcpTargetGroup (AWS::ElasticLoadBalancingV2::TargetGroup) – The target group of the load balancer.
    - TcpListener (AWS::ElasticLoadBalancingV2::Listener) – The listener for the load balancer.
  - KubeLoadBalancer (AWS::ElasticLoadBalancingV2::LoadBalancer) – Private network load balancer used for node registration.
  - KubeApiTcpTargetGroup (AWS::ElasticLoadBalancingV2::TargetGroup) – The target group for the node registration traffic of the KubeLoadBalancer.
  - KubeApiTcpListener (AWS::ElasticLoadBalancingV2::Listener) – The listener for the node registration traffic of the KubeLoadBalancer.
  - Rke2RegistrationTcpTargetGroup (AWS::ElasticLoadBalancingV2::TargetGroup) – The target group for the node registration traffic of the KubeLoadBalancer.
  - Rke2RegistrationTcpListener (AWS::ElasticLoadBalancingV2::Listener) – The listener for the node registration traffic of the KubeLoadBalancer.
  - RootRecordSet (AWS::Route53::RecordSet) – DNS A record for the FQDN.
  - SubdomainRecordSet (AWS::Route53::RecordSet) – DNS A record for the subdomains of the FQDN.
- Management stack:
  - LifecycleAutomationLogs (AWS::Logs::LogGroup) – Log group for logging events from the SSM automation.
  - ClusterOperationsAutomationLogs – Log group for logging events related to cluster operations.
  - OnDemandRestoreStateMachine (AWS::StepFunctions::StateMachine) – Step function used to orchestrate the restore flow.
  - SSM Documents (AWS::SSM::Document), sets of steps used to provide graceful node removal:
    - ServerRemoveInstanceDocument
    - AgentRemoveInstanceDocument
    - UpdateAMIDocument – Updates the AMI ID for the Auto Scaling Groups.
    - RegisterAiCenter – Registers AI Center to an external Orchestrator provided at deployment time.
    - OnDemandBackup – Creates a manual snapshot of the Automation Suite cluster.
    - GetBackupList – Retrieves all available snapshots for the Automation Suite cluster.
    - OnDemandRestoreDocument – Restores the Automation Suite cluster from a given snapshot.
  - Auto Scaling lifecycle hooks (AWS::AutoScaling::LifecycleHook) that allow us to run the SSM documents when an EC2 instance receives an instance termination event:
    - ServerAsgLifeCycleHookTerminating
    - AgentAsgLifeCycleHookTerminating
    - AsRobotsAsgLifeCycleHookTerminating
  - Event rules (AWS::Events::Rule) that trigger the execution of the SSM Documents:
    - ServerTerminateEventRule
    - AgentTerminateEventRule
    - AsRobotsTerminateEventRule
  - IAM roles (AWS::IAM::Role) needed for running SSM Documents and adding logs to the Log Group:
    - AutomationAssumeRole
    - EventsBridgeAssumeRole
    - StateMachinesAssumeRole
    Note: AutomationAssumeRole and StateMachinesAssumeRole allow full access to Amazon SSM. For more information, see AmazonSSMFullAccess.
- Server stack:
  - ServerLaunchConfiguration (AWS::EC2::LaunchTemplate) – EC2 instance configuration for the server nodes. Disk configuration:
    - OS disk – sku gp3, capacity 256GiB
    - Cluster disk – sku gp3, capacity 300GiB
    - etcd disk – sku io1, capacity 32GiB
    - Data disk – sku gp3, capacity 512GiB regardless of the selected services.
    - Objectstore disk – sku gp3, capacity 512GiB
    - Optional disk for Automation Suite Robots package caching – sku gp3, capacity 32GiB. The disk is deployed only if the Automation Suite Robots service is enabled in a single-node deployment.
  - AgentLaunchConfiguration (AWS::EC2::LaunchTemplate) – EC2 instance configuration for the agent nodes. Disk configuration:
    - OS disk – sku gp3, capacity 128GiB
    - Cluster disk – sku gp3, capacity 256GiB
  - ASRobotsLaunchTemplate (AWS::EC2::LaunchTemplate) – EC2 instance configuration for the ASRobots nodes. Disk configuration:
    - OS disk – sku gp3, capacity 128GiB
    - Cluster disk – sku gp3, capacity 256GiB
    - Robot package caching disk – sku gp3, capacity 32GiB
  - GpuEnabledNode (AWS::EC2::Instance) – Optional GPU node. It has the same disk configuration as an agent.
  - TaskMiningNode (AWS::EC2::Instance) – Optional Task Mining node. Deployed only if the Task Mining service is selected. It has the same disk configuration as an agent.
  - BastionHost (AWS::EC2::Instance) – Optional EC2 instance used to SSH to cluster nodes. It has the t3.large instance type and a 200GiB gp3 disk.
  - ServerAutoScalingGroup (AWS::AutoScaling::AutoScalingGroup) – Auto Scaling group for the servers.
  - AgentAutoScalingGroup (AWS::AutoScaling::AutoScalingGroup) – Auto Scaling group for the agents.
  - ASRobotsAutoScalingGroup – Auto Scaling group for dedicated Automation Suite Robots nodes. The capacity of this scaling group is 1 if the deployment is Multi Node and the Automation Suite Robots service is enabled, and 0 otherwise.
  - Optional ServiceFabricIamRole (AWS::IAM::Role) that has permissions to:
    - write logs
    - read EC2 instances configurations
    - download AWS Quickstart resources
    - access the Automation Suite installation configuration secret
    - access the cluster kubeconfig configuration secret
  - ServiceFabricSecurityGroup (AWS::EC2::SecurityGroup) – Security Group allowing access to UiPath® applications.
  - BastionSecurityGroup (AWS::EC2::SecurityGroup) – Optional Security Group allowing SSH access to Bastion.
  - AsgProcessModifierFunction (AWS::Lambda::Function) – Used to modify the ASG processes during CF stack creation.
  - AsgProcessModificationRole (AWS::IAM::Role) – IAM role to provide minimum permissions for the AsgProcessModifierFunction.
  - SSM parameters (AWS::SSM::Parameter):
    - InstanceAMIIdSSMParameter – Stores the AMI ID of the nodes.
    - InstanceAMIImageNameSSMParameter – Holds the Image Name used at deployment time or updated via the UpdateAMIDocument.
  - Auto Scaling lifecycle hooks (AWS::AutoScaling::LifecycleHook) that allow us to transition EC2 instances to InService state after the installer succeeded:
    - ServerAsgLifeCycleHookLaunching
    - AgentAsgLifeCycleHookLaunching
    - ASRobotsAsgLifeCycleHookLaunching
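
Several resources listed above are deployed only under certain conditions. The following Python sketch restates those conditions and the per-node disk sizes so they can be checked at a glance. It is an illustration only, not code from the deployment template: the `Deployment` class, the function names, and the dictionary keys are hypothetical, and only the sizes and conditions are taken from the lists above.

```python
from __future__ import annotations

from dataclasses import dataclass

# Disk sizes in GiB, copied from the disk configuration lists above.
# The dictionary keys are illustrative names, not template identifiers.
SERVER_DISKS = {"os": 256, "cluster": 300, "etcd": 32, "data": 512, "objectstore": 512}
AGENT_DISKS = {"os": 128, "cluster": 256}
ASROBOTS_DISKS = {"os": 128, "cluster": 256, "robot_package_cache": 32}
ROBOT_CACHE_GIB = 32


@dataclass(frozen=True)
class Deployment:
    """Hypothetical summary of the choices made at deployment time."""

    multi_node: bool
    process_mining: bool = False
    as_robots: bool = False
    task_mining: bool = False


def server_disks(d: Deployment) -> dict[str, int]:
    """Return the disks attached to a server node, in GiB."""
    disks = dict(SERVER_DISKS)
    # The caching disk is added only for single-node deployments
    # that have the Automation Suite Robots service enabled.
    if d.as_robots and not d.multi_node:
        disks["robot_package_cache"] = ROBOT_CACHE_GIB
    return disks


def asrobots_asg_capacity(d: Deployment) -> int:
    """Capacity of ASRobotsAutoScalingGroup: 1 for multi-node with AS Robots, else 0."""
    return 1 if d.multi_node and d.as_robots else 0


def deploys_process_mining_db(d: Deployment) -> bool:
    """PMRDSDBInstance exists only for multi-node deployments with Process Mining."""
    return d.multi_node and d.process_mining


def deploys_task_mining_node(d: Deployment) -> bool:
    """TaskMiningNode exists only if the Task Mining service is selected."""
    return d.task_mining


if __name__ == "__main__":
    single = Deployment(multi_node=False, as_robots=True)
    print(server_disks(single), asrobots_asg_capacity(single))
```

Under these assumptions, a single-node deployment with Automation Suite Robots gets the extra caching disk on the server node and no dedicated Automation Suite Robots node, while a multi-node deployment gets the opposite.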
The template dynamically computes the hardware needed for the deployment as follows:
- Depending on the services installed, it sets minimum requirements at cluster level.
- Depending on the deployment profile (multi-node or single-node profile), it sets minimum requirements for a single VM.
- It selects the instance types based on the aforementioned requirements and on their availability in the region where you deploy.
The following table shows the mappings between deployment and possible instance types:
| Deployment type | Instance types |
|---|---|
| Single-node, services selection that needs less than 16 CPUs | c5.4xlarge, c5a.4xlarge, m5.4xlarge, m5a.4xlarge |
| Single-node, services selection that needs more than 16 CPUs | c5a.8xlarge, c5.9xlarge, m5.8xlarge |
| Multi-node, services selection that needs less than 48 CPUs | c5.4xlarge, c5a.4xlarge, m5.4xlarge, m4.4xlarge |
| Multi-node, services selection that needs more than 48 CPUs | c5a.8xlarge, c5.9xlarge, m5.8xlarge, m5a.8xlarge |
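
The selection described above can be sketched in Python as a lookup followed by an availability filter. This is a minimal illustration, not the logic of ComputeResourceSizeFunction: the function names are hypothetical, the table does not say how a requirement of exactly 16 or 48 CPUs is handled (the sketch assumes it goes to the larger tier), and picking the first available type in table order is also an assumption.

```python
from __future__ import annotations

# Candidate instance types, copied from the table above.
# Key: (deployment type, True if the selection needs the larger tier).
CANDIDATES = {
    ("single-node", False): ["c5.4xlarge", "c5a.4xlarge", "m5.4xlarge", "m5a.4xlarge"],
    ("single-node", True): ["c5a.8xlarge", "c5.9xlarge", "m5.8xlarge"],
    ("multi-node", False): ["c5.4xlarge", "c5a.4xlarge", "m5.4xlarge", "m4.4xlarge"],
    ("multi-node", True): ["c5a.8xlarge", "c5.9xlarge", "m5.8xlarge", "m5a.8xlarge"],
}

# CPU thresholds from the table: 16 for single-node, 48 for multi-node.
CPU_THRESHOLD = {"single-node": 16, "multi-node": 48}


def candidate_instance_types(deployment: str, required_cpus: int) -> list[str]:
    """Return the table row that matches the deployment type and CPU need."""
    if deployment not in CPU_THRESHOLD:
        raise ValueError(f"unknown deployment type: {deployment!r}")
    # Assumption: a requirement exactly at the threshold uses the larger tier.
    needs_large = required_cpus >= CPU_THRESHOLD[deployment]
    return list(CANDIDATES[(deployment, needs_large)])


def pick_instance_type(
    deployment: str, required_cpus: int, available_in_region: set[str]
) -> str | None:
    """Return the first candidate offered in the region, or None if there is none.

    Assumption: candidates are tried in the order the table lists them.
    """
    for instance_type in candidate_instance_types(deployment, required_cpus):
        if instance_type in available_in_region:
            return instance_type
    return None


if __name__ == "__main__":
    # Hypothetical region that offers only two of the larger types.
    region_offering = {"m5.8xlarge", "m5a.8xlarge"}
    print(pick_instance_type("multi-node", 64, region_offering))
```

The set of types available in a region is passed in as plain data here so the sketch stays self-contained; in practice that information would come from AWS.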