Using the Automation Suite Diagnostics Tool

This article explains what the Automation Suite Diagnostics Tool is and how to use it.

Overview


The Automation Suite Diagnostics Tool is the first tool to run when you encounter an issue with Automation Suite. It checks the health of the required components and produces a consolidated report.

You can get the Automation Suite Diagnostics Tool in the following ways:

  1. By unzipping the sf-installer.zip installer package.
  2. By downloading the supportability-tools.zip archive.

Before running the Automation Suite Diagnostics Tool, navigate to the installer folder. You may find the installer in the following location or anywhere you downloaded it:

cd /opt/UiPathAutomationSuite/{version}/installer

To start using the Automation Suite Diagnostics Tool, run the following command:

./Support-Tools/diagnostics-tool/diagnostics-report.sh

The following table lists the checks the Automation Suite Diagnostics Tool performs. Note that you can run the script on any of the nodes in the cluster as well as externally.

Node | Checks
Master node | Checks if required services are running; tests if disk sizes are properly configured; runs a Kubernetes job that collects data on the health of other services.
Agent node | Checks if required services are running on the node; tests if disk sizes are properly configured.
External machine | Runs a Kubernetes job to collect the health of the services.

Note: To run the script from an external machine, first set the kubeconfig context to point to the cluster, and then pass the -e flag to the script: bash diagnostics-report.sh -e
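
The two steps in the note above can be sketched as a small helper. The function name run_external_diagnostics and its context-name argument are illustrative and not part of the tool. The sketch assumes that kubectl is installed on the external machine and that you run it from the folder that contains diagnostics-report.sh.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: select a kubeconfig context, then run the report
# in external mode. Run it from the folder that contains diagnostics-report.sh.
run_external_diagnostics() {
  local context_name="$1"   # name of the context that points at the cluster

  # Switch the active kubeconfig context; stop if the switch fails.
  kubectl config use-context "$context_name" || return 1

  # -e tells the script that it runs from an external machine.
  bash diagnostics-report.sh -e
}
```

To find the context name, list the available contexts with kubectl config get-contexts, and then call run_external_diagnostics with the name you want.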
[Figure: sample report generated by the Automation Suite Diagnostics Tool]

 

Reading diagnostics reports


INFO logs

INFO logs, shown in green, indicate that the required checks passed. You should still review the reported disk and memory usage to catch problems that the checks do not flag.

WARN messages

WARN messages do not signal a high risk, but you might still have to address them, as they can affect some services in certain scenarios.

ERROR messages

You must fix the issues described by these messages, as they impact at least one service in the cluster.
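
When a report is long, a per-level count helps decide where to start. The following is a minimal sketch, assuming you saved the report output to a text file and that every message line contains the literal word INFO, WARN, or ERROR. The helper name summarize_report and the file name are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count the lines of a saved report per log level.
# Assumes each message line contains the literal word INFO, WARN, or ERROR.
summarize_report() {
  local report_file="$1" level count
  for level in INFO WARN ERROR; do
    # grep -c prints the number of matching lines; "|| true" keeps a
    # zero count from being treated as a failure.
    count=$(grep -c "$level" "$report_file" || true)
    printf '%s=%s\n' "$level" "$count"
  done
}
```

For example, if the script writes its report to the terminal, you could capture it with ./Support-Tools/diagnostics-tool/diagnostics-report.sh | tee report.txt, run summarize_report report.txt, and then list the ERROR lines with grep ERROR report.txt.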

rke2-server or rke2-agent service down

If either of these services is down, the node is down. Try restarting the service using systemctl restart, as this should fix the issue.
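
A minimal sketch of that restart follows. It assumes the systemd unit names match the service names shown in the report (rke2-server and rke2-agent) and that you have sudo rights on the node. The function name restart_rke2 is illustrative.

```shell
#!/usr/bin/env bash
# Hypothetical helper: restart whichever rke2 unit is installed on this
# node and is currently not active.
restart_rke2() {
  local unit
  for unit in rke2-server rke2-agent; do
    # "systemctl cat" succeeds only if the unit exists on this node;
    # "is-active --quiet" succeeds only if the unit is running.
    if systemctl cat "$unit" >/dev/null 2>&1 \
        && ! systemctl is-active --quiet "$unit"; then
      echo "Restarting $unit"
      sudo systemctl restart "$unit"
    fi
  done
}
```

After the restart, systemctl status on the same unit shows whether the service came back up.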

Directory size mounted at /var/lib

The report displays the size of the directory mounted at /var/lib, because Kubernetes stores its data there. If the directory is full, various issues might arise. To prevent these problems, increase its size before it fills up.
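
To keep an eye on this directory between diagnostics runs, a check along the following lines can help. This is a sketch only: the helper names are hypothetical, and the 80 percent default threshold is an arbitrary example, not a documented limit.

```shell
#!/usr/bin/env bash
# Print how full (in percent) the filesystem holding a directory is.
usage_percent() {
  # df -P prints one POSIX-format line per filesystem; field 5 is the
  # capacity column, for example "42%".
  df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Print a warning when usage reaches the threshold (default 80, an
# arbitrary example value).
warn_if_nearly_full() {
  local dir="$1" threshold="${2:-80}" used
  used=$(usage_percent "$dir")
  if [ "$used" -ge "$threshold" ]; then
    echo "$dir is ${used}% full"
  fi
}
```

Calling warn_if_nearly_full /var/lib prints a line only when the filesystem behind /var/lib is at or above the threshold.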

rke2 version

The report displays the rke2 version for reference.

Disk Pressure or Memory Pressure

For each node, the report specifies whether it is under Disk Pressure or Memory Pressure. If it is, workloads on that node might start showing issues. Check whether any other processes running on the node are consuming resources, and remove them if that is the case.
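
The same node conditions can be read directly from Kubernetes. The sketch below assumes kubectl is configured for the cluster. The function name nodes_under_pressure is illustrative, and this is not necessarily how the Diagnostics Tool itself obtains the information.

```shell
#!/usr/bin/env bash
# List the nodes whose DiskPressure or MemoryPressure condition is "True".
nodes_under_pressure() {
  kubectl get nodes --no-headers \
    -o 'custom-columns=NAME:.metadata.name,DISK:.status.conditions[?(@.type=="DiskPressure")].status,MEMORY:.status.conditions[?(@.type=="MemoryPressure")].status' \
    | awk '$2 == "True" || $3 == "True" { print $1 }'
}
```

An empty result means that no node currently reports either condition. On a flagged node, a command such as ps aux --sort=-%mem | head lists the processes that use the most memory.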

Ceph services status

We use Ceph as S3 object storage to store logs and files from different applications. The report shows the status of the Ceph services. If they are down, you might have to restart them. Also check whether the disk that Ceph uses is full.

Ports 443 and 31443

We expect ports 443 and 31443 to be open on the hostname that was provided. The report indicates if they are not accessible. If it flags a port, make sure to open it.
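
To confirm from another machine that the ports are reachable, a plain TCP connection test is usually enough. The following sketch uses bash's built-in /dev/tcp redirection together with the coreutils timeout command. The function names and the 5-second limit are illustrative choices.

```shell
#!/usr/bin/env bash
# Succeed if a TCP connection to host:port can be opened within 5 seconds.
port_open() {
  local host="$1" port="$2"
  # Redirecting to /dev/tcp/HOST/PORT makes bash open a TCP connection.
  timeout 5 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}

# Report the state of the two ports that Automation Suite expects to be open.
check_suite_ports() {
  local host="$1" port
  for port in 443 31443; do
    if port_open "$host" "$port"; then
      echo "$port open"
    else
      echo "$port NOT reachable"
    fi
  done
}
```

Call check_suite_ports with the hostname that you provided during installation. A port reported as not reachable can be blocked by a firewall or a load balancer rule as well as by the node itself.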

Certificate validity

The tool checks if the uploaded certificate is valid for the given hostname and if it has not expired. If the certificate does not meet these criteria, errors occur. To prevent this, make sure to check your uploaded certificate and change it if required.
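
Both criteria can also be checked by hand with the openssl command-line tool. This is a sketch under the assumption that openssl is installed. The function names are hypothetical, and the tool's own check may work differently.

```shell
#!/usr/bin/env bash
# Print subject, expiry date, and hostname match for the certificate that
# a host serves on port 443.
show_served_cert() {
  local host="$1"
  # s_client fetches the served certificate (SNI is set via -servername);
  # x509 -checkhost reports whether the certificate matches the hostname.
  openssl s_client -connect "${host}:443" -servername "$host" </dev/null 2>/dev/null \
    | openssl x509 -noout -subject -enddate -checkhost "$host"
}

# Succeed if the certificate file stays valid for at least N more seconds
# (default 0, meaning "not expired right now").
cert_still_valid() {
  openssl x509 -in "$1" -noout -checkend "${2:-0}" >/dev/null
}
```

For example, cert_still_valid cert.pem 2592000 fails if the certificate in cert.pem expires within the next 30 days, which leaves time to replace it.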

GPU

Since some services require a GPU to be present on some of the nodes in the cluster, the Automation Suite Diagnostics Tool checks whether there are GPU nodes and prints the number of such nodes. If you expect GPU nodes to be present and they do not show up here, something went wrong in the GPU setup.
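
If the count looks wrong, you can cross-check which nodes advertise a GPU to Kubernetes. The sketch below assumes the GPUs are exposed through the nvidia.com/gpu resource name, which is common for NVIDIA hardware but is an assumption here. The function name gpu_nodes is illustrative, and the Diagnostics Tool may detect GPU nodes differently.

```shell
#!/usr/bin/env bash
# List the nodes that advertise at least one allocatable GPU.
# Assumption: GPUs are exposed through the "nvidia.com/gpu" resource.
gpu_nodes() {
  kubectl get nodes --no-headers \
    -o 'custom-columns=NAME:.metadata.name,GPU:.status.allocatable.nvidia\.com/gpu' \
    | awk '$2 != "<none>" && $2 != "0" { print $1 }'
}
```

Running gpu_nodes | wc -l gives a node count to compare with the number in the report.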

MongoDB

MongoDB is an important component that the UiPath Apps service uses. If either MongoDB or its primary instance is down, you need to investigate the issue using the support bundle.

RabbitMQ and DockerRegistry

RabbitMQ and DockerRegistry are two important components that some services use. If either of them is down, you need to investigate the issue and restart the component.

ArgoCD services down

ArgoCD is our application lifecycle management (ALM) tool. If any of its services are down, then other applications may become outdated or have other issues. Recovering these services is important, and might need further debugging.

Missing or degraded ArgoCD Applications

The Automation Suite Diagnostics Tool shows whether ArgoCD applications are missing or degraded.

  • If applications are missing, go to the ArgoCD UI and sync them.
  • If applications are degraded, additional debugging is needed to investigate the errors thrown by ArgoCD.
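
As a starting point for that debugging, the sync and health state of every application can be listed with kubectl. The sketch assumes ArgoCD runs in a namespace called argocd, so pass another name if your cluster differs. The function name unhealthy_argocd_apps is hypothetical. It reads the Application resources that exist, so it does not detect applications that are absent altogether.

```shell
#!/usr/bin/env bash
# Print the ArgoCD applications that are not both Synced and Healthy.
# Assumption: ArgoCD runs in a namespace called "argocd".
unhealthy_argocd_apps() {
  local namespace="${1:-argocd}"
  kubectl get applications.argoproj.io -n "$namespace" --no-headers \
    -o 'custom-columns=NAME:.metadata.name,SYNC:.status.sync.status,HEALTH:.status.health.status' \
    | awk '$2 != "Synced" || $3 != "Healthy" { print $1, $2, $3 }'
}
```

Each output line shows the application name followed by its sync status and health status. An empty result means that every listed application is Synced and Healthy.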

Updated 24 days ago

