AI Center

Last updated Jun 6, 2024

General AI Center Troubleshooting and FAQs

Issue: Provisioning Job Blocked in Connection Checking in Progress

The provisioning job may get stuck in Connection checking in progress.

Solution

To fix this issue, follow the steps below:

Quit the logs and check the status of the conn-checker pod by running kubectl get pods.
If the status shows Invalid Image Name, describe the pod: kubectl describe pod <conn-checker-pod-name>.
If the Failed to apply default image.. message is displayed under Events (at the bottom of the output), airgapped and non-airgapped installations may have been mixed:
1. Check that infra and application are installed on the same pod.
2. Check that the license is the same. Check whether the airgapped field in the yaml file is set to true and whether that is expected.
If the issue is with the license, it needs to be changed from the backend. Ask the person who provided the license, or the AI Center team, to change it.
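
The first two steps above can be sketched as a small shell helper. This is a minimal sketch, assuming the pod has conn-checker in its name (as in the step above) and that kubectl get pods -A prints its default columns (NAMESPACE, NAME, READY, STATUS, ...). The function name find_conn_checker is hypothetical and not part of AI Center.

```shell
# Print "<namespace> <pod-name> <status>" for every pod whose name contains
# "conn-checker". Reads `kubectl get pods -A` output from stdin.
# Assumption: default column layout, where $1=NAMESPACE, $2=NAME, $4=STATUS.
find_conn_checker() {
  awk '$2 ~ /conn-checker/ {print $1, $2, $4}'
}

# Example usage (requires cluster access):
#   kubectl get pods -A | find_conn_checker
#   kubectl describe pod -n <namespace> <conn-checker-pod-name>
```

Filtering on the NAME column keeps the namespace next to the pod name, so it can be passed to kubectl describe with -n.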

Issue: Host Admin Page Errors

In case of host admin page errors (tenant provision error), use the solution below.

Solution

Make sure that the system time on the Orchestrator and AI Center VMs is in sync, including daylight saving time. The token provided by Identity Server can be an hour in the future if the system time is not synchronized.
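
One way to check the clocks is to compare Unix timestamps taken on both machines at the same moment. The helper below is a minimal sketch: how the two timestamps are collected is an assumption (for example, date -u +%s on a Linux host), and the function name clock_skew_seconds is hypothetical.

```shell
# Print the absolute difference, in seconds, between two Unix timestamps.
# Unix timestamps count seconds since the epoch in UTC, so the result
# reflects the real clock offset rather than a time zone display setting.
clock_skew_seconds() {
  skew=$(( $1 - $2 ))
  if [ "$skew" -lt 0 ]; then
    skew=$(( -skew ))
  fi
  echo "$skew"
}

# Example usage: collect a timestamp on each VM, then compare them:
#   clock_skew_seconds <orchestrator-epoch> <ai-center-epoch>
```

A result close to 3600 would match the one-hour token offset described above.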

Message: kubectl get pods -A | grep Evicted

If the command above returns a large number of evicted pods, they can slow down the machine or cause network issues.

Solution

To solve this issue, run the following script or a similar one:

# Split command output on newlines only, so each pod is one loop item.
IFS=$'\n'
# Emit "namespace,name,status" per pod and keep only the evicted ones.
for line in $(kubectl get pods -A | awk '{printf "%s,%s,%s\n", $1,$2,$4}' | grep -E "Evicted"); do
  ns=$(echo "$line" | cut -d',' -f1)
  pod=$(echo "$line" | cut -d',' -f2)
  kubectl delete pod -n "$ns" "$pod"
done
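
Before deleting anything, it can help to see how many evicted pods each namespace holds. The helper below is a minimal sketch that only reads and summarizes; it assumes the default kubectl get pods -A column layout, with STATUS in the fourth column, and the function name count_evicted is hypothetical.

```shell
# Print "<namespace> <count>" for pods whose STATUS column is "Evicted".
# Reads `kubectl get pods -A` output from stdin and deletes nothing.
count_evicted() {
  awk '$4 == "Evicted" {n[$1]++} END {for (ns in n) print ns, n[ns]}' | sort
}

# Example usage (requires cluster access):
#   kubectl get pods -A | count_evicted
```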

Issue Regarding ML Skills During Prediction

To monitor a pod while its ML Skill is being called, identify the pod corresponding to the skill, then connect to the Linux machine and check the logs while running a prediction.

Solution

For the most efficient way to identify a pod corresponding to a skill, follow the steps below.

Go to the AI Center application.
Go to the ML Skill page.
Inspect the page and open the network calls (for example, the Network tab of the browser developer tools).
Refresh the grid to get the ML Skill.
Find the ML Skill call and preview it.
Find the right ML Skill in the list and look for its tenant-id and id values. The tenant-id is the namespace and the id is the pod name.
Once you have the above information, check running logs by using the following command:
```
kubectl -n <tenant-id> logs -f <id>
```

You can now call the skill and see the process in real time.

Issue While Pipeline Is Running

A pipeline failure takes place due to a file upload failure with an error message similar to the one below:

2021-04-30 20:59:43,397 - uipath_core.storage.local_storage_client:upload:132 - ERROR:  Failed to upload file: logs/f5f7b9f4-0813-4107-a269-bf65de12444f/train.log.20210430205938 to bucket: training-8319b955-6187-43c3-a46f-612a9ea6f523, error: can't start new thread
2021-04-30 20:59:48,401 - uipath_core.utils.utils:_retries:172 - WARNING:  Function: upload execution failed, retry count 1

Solution

Upgrade to a newer AI Center version (2021.4, for example) where this issue is fixed.

If upgrading is not an option at the moment, delete the logs in the training pod using the following command:

kubectl -n <namespace> exec -it <pod_id> -- sh -c 'rm -rf /microservice/trainer_run_logs'

In the command above, the following variables are used:

namespace - namespace of the pod. This can be obtained by running the kubectl get namespaces command. Training namespaces start with training-.
pod_id - pod id of the training pod. This can be obtained by running the kubectl get pod command in the training namespace identified above.
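
Looking up the two variables can be sketched as follows. This is a minimal sketch, assuming the default kubectl get namespaces output with the namespace name in the first column; the function name training_namespaces is hypothetical.

```shell
# Print the names of namespaces that start with "training-".
# Reads `kubectl get namespaces` output from stdin.
training_namespaces() {
  awk '$1 ~ /^training-/ {print $1}'
}

# Example usage (requires cluster access):
#   kubectl get namespaces | training_namespaces
#   kubectl get pod -n <namespace>
#   kubectl -n <namespace> exec -it <pod_id> -- sh -c 'rm -rf /microservice/trainer_run_logs'
```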