automation-suite
2022.4
AWS deployment architecture
Automation Suite Installation Guide
Last updated Dec 19, 2024
- Main stack – principal entry point:
  - Network stack
  - Uipath-sf stack
- In-depth configurable stack:
  - Network stack
  - Uipath-sf stack
- Uipath-sf:
  - SSL stack
  - Routing stack
  - Server stack
  - Database stack
  - Backup stack
  - Management stack
- Lambda functions (`AWS::Lambda::Function`):
  - `FindAMIFunction` – for finding a matching AMI ID.
  - `CreateInputJsonFunction` – for creating the configuration used by the Automation Suite installer.
  - `ComputeResourceSizeFunction` – for computing the minimum EC2 instance hardware configuration needed, based on the selected services and deployment type.
- IAM roles (`AWS::IAM::Role`) for the Lambda functions to provide minimum permissions:
  - `FindAmiLambdaRole`
  - `CreateInputJsonLambdaRole`
  - `ComputeResourceSizeLambdaRole`
- Secrets (`AWS::SecretsManager::Secret`) to store sensitive information:
  - `RDSPassword`
  - `OrgSecret`
  - `PlatformSecret`
  - `ArgoCdSecret`
  - `ArgoCdUserSecret`
  - `InputJsonSecret`
  - `KubeconfigSecret`
- SSL Stack (optional)
- Network stack (optional)
- Backup stack (optional):
  - `ClusterBackupStorage` (`AWS::EFS::FileSystem`) – Amazon Elastic File System used to store the backup.
  - `SharedStorageSecurityGroup` (`AWS::EC2::SecurityGroup`) – Security group used to allow NFS network connections from the cluster nodes.
  - `SharedStorageMountTargetOne` (`AWS::EFS::MountTarget`) – Resource that creates the mount target for the EFS file system and the first private subnet.
  - `SharedStorageMountTargetTwo` (`AWS::EFS::MountTarget`) – Resource that creates the mount target for the EFS file system and the second private subnet.
  - `SharedStorageMountTargetThree` (`AWS::EFS::MountTarget`) – Optional resource that creates the mount target for the EFS file system and the third private subnet.
- Database stack:
  - `RDSDBInstance` (`AWS::RDS::DBInstance`) – The Amazon RDS DB instance. The DB SKU is `db.m5.2xlarge`.
  - `DBSubnetGroup` (`AWS::RDS::DBSubnetGroup`) – Private subnet group that contains the private subnets.
  - `DbSecurityGroup` (`AWS::EC2::SecurityGroup`) – Security group allowing access to the DB instance.
- Routing stack:
  NOTE: The Alb and Nlb stacks are mutually exclusive configurations.
  - Alb stack:
    - `ExternalLoadBalancer` (`AWS::ElasticLoadBalancingV2::LoadBalancer`) – Application load balancer used to distribute Automation Suite traffic. It can be internal or internet-facing.
    - `ELBSecurityGroup` (`AWS::EC2::SecurityGroup`) – The security group applied to the load balancer.
    - `HttpsTargetGroup` (`AWS::ElasticLoadBalancingV2::TargetGroup`) – The target group of the load balancer.
    - `HttpsListener` (`AWS::ElasticLoadBalancingV2::Listener`) – The listener for the load balancer.
  - Nlb stack:
    - `ExternalLoadBalancer` (`AWS::ElasticLoadBalancingV2::LoadBalancer`) – Network load balancer used to distribute Automation Suite traffic. It can be internal or internet-facing.
    - `TcpTargetGroup` (`AWS::ElasticLoadBalancingV2::TargetGroup`) – The target group of the load balancer.
    - `TcpListener` (`AWS::ElasticLoadBalancingV2::Listener`) – The listener for the load balancer.
  - `KubeLoadBalancer` (`AWS::ElasticLoadBalancingV2::LoadBalancer`) – Private network load balancer used for node registration.
  - `KubeApiTcpTargetGroup` (`AWS::ElasticLoadBalancingV2::TargetGroup`) – The target group for the node registration traffic of the `KubeLoadBalancer`.
  - `KubeApiTcpListener` (`AWS::ElasticLoadBalancingV2::Listener`) – The listener for the node registration traffic of the `KubeLoadBalancer`.
  - `Rke2RegistrationTcpTargetGroup` (`AWS::ElasticLoadBalancingV2::TargetGroup`) – The target group for the node registration traffic of the `KubeLoadBalancer`.
  - `Rke2RegistrationTcpListener` (`AWS::ElasticLoadBalancingV2::Listener`) – The listener for the node registration traffic of the `KubeLoadBalancer`.
  - `RootRecordSet` (`AWS::Route53::RecordSet`) – DNS A record for the FQDN.
  - `SubdomainRecordSet` (`AWS::Route53::RecordSet`) – DNS A record for the subdomains of the FQDN.
- Management stack:
  - `LifecycleAutomationLogs` (`AWS::Logs::LogGroup`) – Log group for logging events from the SSM automation.
  - SSM Documents (`AWS::SSM::Document`), sets of steps used to provide graceful node removal:
    - `ServerRemoveInstanceDocument`
    - `AgentRemoveInstanceDocument`
    - `UpdateAMIDocument` – Updates the AMI ID for the Auto Scaling Groups.
  - Autoscaling lifecycle hooks (`AWS::AutoScaling::LifecycleHook`) that allow us to run the SSM documents when an EC2 instance receives an instance termination event:
    - `ServerAsgLifeCycleHookTerminating`
    - `AgentAsgLifeCycleHookTerminating`
  - Event rules (`AWS::Events::Rule`) that trigger the execution of the SSM Documents:
    - `ServerTerminateEventRule`
    - `AgentTerminateEventRule`
  - IAM roles (`AWS::IAM::Role`) needed for running SSM Documents and adding logs to the Log Group:
    - `AutomationAssumeRole`
    - `EventsBridgeAssumeRole`
- Server stack:
  - `ServerLaunchConfiguration` (`AWS::EC2::LaunchTemplate`) – EC2 instance configuration for the server nodes. Disk configuration:
    - OS disk – sku gp2, capacity 128GiB
    - Cluster disk – sku gp2, capacity 300GiB
    - etcd disk – sku io1, capacity 32GiB
    - Data disk – sku gp2, capacity 512GiB or 2TiB depending on the selected services.
  - `AgentLaunchConfiguration` (`AWS::EC2::LaunchTemplate`) – EC2 instance configuration for the agent nodes. Disk configuration:
    - OS disk – sku gp2, capacity 128GiB
    - Cluster disk – sku gp2, capacity 300GiB
  - `GpuEnabledNode` (`AWS::EC2::Instance`) – Optional GPU node. It has the same disk configuration as an agent.
  - `TaskMiningNode` (`AWS::EC2::Instance`) – Optional Task Mining node. Deployed only if the Task Mining service is selected. It has the same disk configuration as an agent.
  - `BastionHost` (`AWS::EC2::Instance`) – Optional EC2 instance used to SSH to cluster nodes. It has the `t3.large` instance type and a 200GiB gp2 disk.
  - `ServerAutoScalingGroup` (`AWS::AutoScaling::AutoScalingGroup`) – Auto scaling group for the servers.
  - `AgentAutoScalingGroup` (`AWS::AutoScaling::AutoScalingGroup`) – Auto scaling group for the agents.
  - Optional `ServiceFabricIamRole` (`AWS::IAM::Role`) that has permissions to:
    - write logs
    - read EC2 instances configurations
    - download AWS Quickstart resources
    - access the Automation Suite installation configuration secret
    - access the cluster kubeconfig configuration secret
  - `ServiceFabricSecurityGroup` (`AWS::EC2::SecurityGroup`) – Security group allowing access to UiPath applications.
  - `BastionSecurityGroup` (`AWS::EC2::SecurityGroup`) – Optional security group allowing SSH access to the bastion host.
  - `AsgProcessModifierFunction` (`AWS::Lambda::Function`) – Used to modify the ASG processes during CF stack creation.
  - `AsgProcessModificationRole` (`AWS::IAM::Role`) – IAM role to provide minimum permissions for the `AsgProcessModifierFunction`.
  - SSM parameters (`AWS::SSM::Parameter`):
    - `InstanceAMIIdSSMParameter` – Stores the AMI ID of the nodes.
    - `InstanceAMIImageNameSSMParameter` – Holds the image name used at deployment time or updated via the `UpdateAMIDocument`.
  - Autoscaling lifecycle hooks (`AWS::AutoScaling::LifecycleHook`) that allow us to transition EC2 instances to the InService state after the installer succeeds:
    - `ServerAsgLifeCycleHookLaunching`
    - `AgentAsgLifeCycleHookLaunching`
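As a rough capacity check, the disk sizes listed for the server and agent launch templates can be added up per node. The sketch below is only arithmetic over the figures stated above; the dictionary and function names are hypothetical and are not part of the template.

```python
# Disk sizes in GiB, copied from the Server stack disk configuration above.
# All names here are illustrative; they do not come from the template.
GIB_PER_TIB = 1024

SERVER_FIXED_DISKS_GIB = {"os": 128, "cluster": 300, "etcd": 32}
AGENT_DISKS_GIB = {"os": 128, "cluster": 300}


def server_total_gib(data_disk_gib: int) -> int:
    """Total EBS capacity of one server node for a given data disk size."""
    return sum(SERVER_FIXED_DISKS_GIB.values()) + data_disk_gib


def agent_total_gib() -> int:
    """Total EBS capacity of one agent node (no etcd or data disk)."""
    return sum(AGENT_DISKS_GIB.values())


if __name__ == "__main__":
    print(server_total_gib(512))              # 512GiB data disk
    print(server_total_gib(2 * GIB_PER_TIB))  # 2TiB data disk
    print(agent_total_gib())
```

Under these figures a server node carries 972GiB with the 512GiB data disk or 2508GiB with the 2TiB data disk, and an agent node carries 428GiB.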
The template dynamically computes the hardware needed for the deployment as follows:
- Depending on the services installed (basic or complete product selection), it sets minimum requirements at cluster level.
- Depending on the deployment profile (multi-node or single-node profile), it sets minimum requirements for a single VM.
- It selects the instance types based on their availability in the region where you deploy and on the requirements above.
The following table shows the mappings between deployment types and possible instance types:
| Deployment type | Instance types |
|---|---|
| Basic single-node | `c5.4xlarge`, `c5a.4xlarge`, `m5.4xlarge`, `m5a.4xlarge` |
| Complete single-node | `c5a.8xlarge`, `c5.9xlarge`, `m5.8xlarge` |
| Basic multi-node | `c5.4xlarge`, `c5a.4xlarge`, `m5.4xlarge`, `m4.4xlarge` |
| Complete multi-node | `c5a.8xlarge`, `c5.9xlarge`, `m5.8xlarge`, `m5a.8xlarge` |
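
To illustrate how such a mapping could be combined with regional availability, here is a minimal sketch. It assumes a first-match rule in the order the table lists the types; the actual selection logic of `ComputeResourceSizeFunction` is not documented here, and the names `INSTANCE_TYPES` and `pick_instance_type` are hypothetical.

```python
from typing import Iterable, Optional

# Candidate instance types per deployment type, copied from the table above.
INSTANCE_TYPES = {
    ("basic", "single-node"): ["c5.4xlarge", "c5a.4xlarge", "m5.4xlarge", "m5a.4xlarge"],
    ("complete", "single-node"): ["c5a.8xlarge", "c5.9xlarge", "m5.8xlarge"],
    ("basic", "multi-node"): ["c5.4xlarge", "c5a.4xlarge", "m5.4xlarge", "m4.4xlarge"],
    ("complete", "multi-node"): ["c5a.8xlarge", "c5.9xlarge", "m5.8xlarge", "m5a.8xlarge"],
}


def pick_instance_type(
    selection: str, profile: str, available_in_region: Iterable[str]
) -> Optional[str]:
    """Return the first candidate offered in the region, or None if none is.

    Assumption: candidates are tried in table order. The real template may
    rank them differently.
    """
    available = set(available_in_region)
    for candidate in INSTANCE_TYPES[(selection, profile)]:
        if candidate in available:
            return candidate
    return None
```

For example, `pick_instance_type("basic", "single-node", ["m5.4xlarge", "m5a.4xlarge"])` returns `"m5.4xlarge"` under this rule, and an empty availability list yields `None`.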