Automation Suite
2021.10
Automation Suite Installation Guide
Last updated Apr 19, 2024

Using the Automation Suite Diagnostics Tool

The Automation Suite Diagnostics Tool is the first tool to use when you encounter any issue with Automation Suite. It checks the health of the required components and produces a consolidated report.

Tip:
Download the supportability-tools zip and extract its contents using the following commands:

curl "https://download.uipath.com/automation-suite/2021.10.3/supportability-tools-2021.10.3.zip" -o supportability-tools-2021.10.3.zip

unzip supportability-tools-2021.10.3.zip -d support-tools

Then, you can run the Automation Suite Diagnostics Tool from the support-tools/diagnostics-tool/ folder using the bash diagnostics-report.sh command.

The following table lists the checks that the Automation Suite Diagnostics Tool performs. You can run the script on any node in the cluster, or from an external machine.

Node: Master node
Checks:
  • Checks if required services are running.
  • Tests if disk sizes are properly configured.
  • Runs a Kubernetes job that collects data on the health of other services.

Node: Agent node
Checks:
  • Checks if required services are running on the node.
  • Tests if disk sizes are properly configured.

Node: External machine
Checks:
  • Runs a Kubernetes job to collect the health of the services.

Note: To run the script from an external machine, first set the kubeconfig context to the cluster, and then pass the -e flag to the script: bash diagnostics-report.sh -e
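As a rough sketch of the external run described in the note, the wrapper below switches the kubeconfig context and then calls the script with the -e flag. The function name run_external_diagnostics and the context name in the usage line are hypothetical; it also assumes kubectl is installed and that diagnostics-report.sh is in the current directory.

```shell
# Sketch only: run the diagnostics script from a machine outside the cluster.
# Assumes kubectl is installed and diagnostics-report.sh is in the current directory.
run_external_diagnostics() {
  context="$1"   # kubeconfig context that points to the cluster (hypothetical name)
  # Switch to the cluster's context; stop if that fails.
  kubectl config use-context "$context" || return 1
  # -e tells the script that it runs on an external machine.
  bash diagnostics-report.sh -e
}
```

For example, run_external_diagnostics my-cluster-context. If you are unsure which contexts exist, kubectl config get-contexts lists them.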

Sample report generated by the Automation Suite Diagnostics Tool.



Reading Diagnostics Reports

INFO Logs

INFO logs, shown in green, indicate that the required checks passed. You should still review the disk and memory usage to catch hidden errors.

WARN Messages

These messages do not signal a high risk, but you might still have to fix the underlying issues, as they can affect some services in certain scenarios.

ERROR Messages

You must fix the issues described by these messages, as they impact at least one service in the cluster.

Rke2-server or Rke2-agent Service Down

If either of these services is down, the node is down. Try restarting the service using systemctl restart <service-name>, as this should fix the issue.
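The restart step can be sketched as a small helper that first checks whether the service is active and restarts it only if it is not. The function name check_and_restart is hypothetical; the service names rke2-server and rke2-agent are the ones this section refers to, and restarting a service typically requires root privileges.

```shell
# Sketch only: restart an rke2 service if systemd reports it as not active.
check_and_restart() {
  service="$1"   # rke2-server or rke2-agent, depending on the node
  if systemctl is-active --quiet "$service"; then
    echo "$service is active"
  else
    echo "$service is not active, restarting"
    systemctl restart "$service"
  fi
}
```

For example, run check_and_restart rke2-server on a master node or check_and_restart rke2-agent on an agent node.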

Directory Size Mounted at /var/lib

The report displays the size of the directory mounted at /var/lib because Kubernetes stores its data there. If the directory fills up, various issues might arise. To prevent these problems, increase its size before it runs out of space.
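To check the usage of /var/lib yourself between diagnostics runs, something like the following sketch might help. The function names and the 80 percent threshold in the usage line are illustrative assumptions, not values taken from the tool.

```shell
# Sketch only: report how full the filesystem behind a path is.
usage_percent() {
  # df -P prints one line per filesystem; column 5 is the usage, such as "91%".
  df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

warn_if_full() {
  path="$1"
  threshold="$2"   # percentage chosen by you; not a value used by the tool
  used=$(usage_percent "$path")
  if [ "$used" -ge "$threshold" ]; then
    echo "WARN: $path is ${used}% full"
  else
    echo "OK: $path is ${used}% full"
  fi
}
```

For example, warn_if_full /var/lib 80 prints a warning once usage reaches 80 percent.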

Rke2 Version

The report displays the rke2 version for reference.

Disk Pressure or Memory Pressure

For each node, the report specifies whether it is under Disk Pressure or Memory Pressure. If it is, workloads on that node might start showing issues. Check whether any other processes running on the node are consuming resources, and remove them if that is the case.
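If you want to confirm the pressure conditions directly, the node conditions can be read with kubectl. The sketch below assumes kubectl is configured for the cluster; the function names are hypothetical.

```shell
# Sketch only: list nodes whose DiskPressure or MemoryPressure condition is True.
node_conditions() {
  # Prints one line per node: name, a tab, then space-separated type=status pairs.
  kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{range .status.conditions[*]}{.type}={.status}{" "}{end}{"\n"}{end}'
}

nodes_under_pressure() {
  node_conditions | awk -F'\t' '/(DiskPressure|MemoryPressure)=True/ { print $1 }'
}
```

Calling nodes_under_pressure prints the names of the affected nodes, or nothing if no node reports pressure.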

Ceph Services Status

We use Ceph as S3 object storage for logs and files from different applications. The report shows the status of the Ceph services. If they are down, you might have to restart them. Also check whether the disk that Ceph uses is full.

Ports 443 and 31443

We expect ports 443 and 31443 to be open on the hostname that was provided. The report indicates if they are not accessible. If the report flags a port, make sure to open it.
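A quick manual check of the two ports could look like the sketch below. It assumes a netcat binary (nc) that supports the -z and -w options is installed; the function name and the hostname in the usage line are placeholders.

```shell
# Sketch only: test whether TCP ports on a host accept connections.
check_ports() {
  host="$1"
  shift
  for port in "$@"; do
    # -z: only probe the port, send no data; -w 5: give up after 5 seconds.
    if nc -z -w 5 "$host" "$port" >/dev/null 2>&1; then
      echo "$host:$port open"
    else
      echo "$host:$port not reachable"
    fi
  done
}
```

For example, check_ports automationsuite.example.com 443 31443, replacing the hostname with the one you provided at installation.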

Certificate Validity

The tool checks if the uploaded certificate is valid for the given hostname and if it has not expired. If the certificate does not meet these criteria, errors occur. To prevent this, make sure to check your uploaded certificate and change it if required.
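To inspect the certificate yourself, openssl can print the expiry date of the certificate served on the hostname and test a local certificate file for expiry. This is a sketch under the assumption that openssl is installed; the function names are hypothetical, and the tool itself may perform its checks differently.

```shell
# Sketch only: inspect certificate expiry with openssl.

# Print the expiry date (notAfter) of the certificate served at host:port.
served_cert_summary() {
  host="$1"
  port="${2:-443}"
  openssl s_client -connect "$host:$port" -servername "$host" </dev/null 2>/dev/null \
    | openssl x509 -noout -enddate
}

# Return 0 if the PEM certificate in the given file has not expired yet.
cert_file_not_expired() {
  # -checkend 0 fails if the certificate expires within the next 0 seconds.
  openssl x509 -in "$1" -noout -checkend 0 >/dev/null
}
```

For example, served_cert_summary automationsuite.example.com prints the notAfter date of the served certificate, where the hostname is a placeholder for your own.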

GPU

Some services require a GPU to be present on some of the nodes in the cluster, so the Automation Suite Diagnostics Tool checks whether there are GPU nodes and prints the number of such nodes. If you expect GPU nodes to be present and they do not show up here, something went wrong in the GPU setup.
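One way to cross-check the GPU node count is to look at the GPU resources that each node advertises. The sketch below assumes the GPUs are exposed under the nvidia.com/gpu resource name, which is an assumption about your setup and not something this page states; the function name is hypothetical.

```shell
# Sketch only: count nodes that advertise at least one allocatable GPU.
# Assumes GPUs are exposed as the nvidia.com/gpu resource (not confirmed by this guide).
gpu_node_count() {
  kubectl get nodes --no-headers \
    -o custom-columns='NAME:.metadata.name,GPU:.status.allocatable.nvidia\.com/gpu' \
    | awk '$2 + 0 > 0 { n++ } END { print n + 0 }'
}
```

Nodes without the resource show <none> in the GPU column and are not counted.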

MongoDB

MongoDB is an important component that the UiPath Apps service uses. If either MongoDB or its primary instance is down, you need to investigate the issue using the support bundle.

RabbitMQ and DockerRegistry

RabbitMQ and DockerRegistry are two important components that some services use. If either of them is down, you need to investigate the issue and attempt a restart.

ArgoCD Services Down

ArgoCD is our application lifecycle management (ALM) tool. If any of its services are down, then other applications may become outdated or have other issues. Recovering these services is important, and might need further debugging.

Missing or Degraded ArgoCD Applications

The Automation Suite Diagnostics Tool shows whether ArgoCD applications are missing or degraded.

  • If applications are missing, go to the ArgoCD UI and sync them.
  • If applications are degraded, additional debugging is needed to investigate the errors thrown by ArgoCD.
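The same information can be listed from the command line by reading the health status of the ArgoCD Application resources. This is a sketch that assumes the applications live in a namespace named argocd, which you may need to change; the function name is hypothetical.

```shell
# Sketch only: list ArgoCD applications whose health is Missing or Degraded.
argocd_problem_apps() {
  namespace="${1:-argocd}"   # assumed namespace; pass another one if yours differs
  kubectl -n "$namespace" get applications.argoproj.io --no-headers \
    -o custom-columns='NAME:.metadata.name,SYNC:.status.sync.status,HEALTH:.status.health.status' \
    | awk '$3 == "Missing" || $3 == "Degraded" { print $1 ": " $3 }'
}
```

Each output line names an application and its health status, which tells you whether to sync it in the ArgoCD UI or to start debugging.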

© 2005-2024 UiPath. All rights reserved.