Automation Suite on EKS/AKS Installation Guide
Last updated Oct 4, 2024

Troubleshooting

Health check of Automation Suite robots fails

Description

After installing Automation Suite on AKS, when you check the health status of the Automation Suite robots pod, it returns an unhealthy status: "[POD_UNHEALTHY] Pod asrobots-migrations-cvzfn in namespace uipath is in Failed status".

Potential issue

On rare occasions, the database migrations for Orchestrator and Automation Suite Robots may run at the same time. When this happens, the Automation Suite Robots database migration fails. In Argo CD, you can see two migration pods: one with a healthy status and one with an unhealthy status.

Solution

The database migration for Automation Suite Robots is automatically retried and eventually succeeds. However, Argo CD does not update the pod status. You can safely ignore the unhealthy status.
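If you want to confirm that the retried migration completed, you can inspect the migration pods directly. The following is a minimal sketch; the pod name suffix (cvzfn in the example above) is generated per installation:
kubectl -n uipath get pods | grep asrobots-migrations
kubectl -n uipath logs <name-of-healthy-migration-pod> --tail=50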

The backup setup does not work due to a failure to connect to Azure Government

Description

Following an Automation Suite on AKS installation or upgrade, the backup setup does not work because of a failure to connect to Azure Government.

Solution

You can fix the issue by taking the following steps:

  1. Create a file named velerosecrets.txt, with the following contents:
    AZURE_CLIENT_SECRET=<secretforserviceprincipal>
    AZURE_CLIENT_ID=<clientidforserviceprincipal>
    AZURE_TENANT_ID=<tenantidforserviceprincipal> 
    AZURE_SUBSCRIPTION_ID=<subscriptionidforserviceprincipal>
    AZURE_CLOUD_NAME=AzureUSGovernmentCloud
    AZURE_RESOURCE_GROUP=<infraresourcegroupoftheakscluster>
  2. Encode the data in the velerosecrets.txt file as Base64:
    export b64velerodata=$(cat velerosecrets.txt | base64)
  3. Update the velero-azure secret in the velero namespace, as shown in the following example:
    apiVersion: v1
    kind: Secret
    metadata:
      name: velero-azure
      namespace: velero
    data:
      cloud: <insert the $b64velerodata value here>
  4. Restart the velero deployment:
    kubectl rollout restart deploy -n velero
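After the restart, you can check that Velero came back up and that the secret decodes cleanly. A minimal sketch, assuming a Linux shell and the velero-azure secret name used above:
kubectl -n velero get pods
kubectl -n velero get secret velero-azure -o jsonpath='{.data.cloud}' | base64 -d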

Pods in the uipath namespace stuck when enabling custom node taints

Description

Pods in the uipath namespace are not running when custom node taints are enabled. The pods cannot reach the admctl webhook, which injects pod tolerations, in an EKS environment.

Solution

To fix the issue, create a network policy that allows traffic to the admctl webhook from the cluster pod CIDR or from 0.0.0.0/0:
kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: allow-all-ingress-to-admctl
  namespace: uipath
spec:
  podSelector:
    matchLabels:
      app: admctl-webhook
  ingress:
    - from:
        - ipBlock:
            cidr: <cluster-pod-cidr> # or "0.0.0.0/0"
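Apply the manifest and confirm that the policy is in place. A minimal sketch; the file name admctl-netpol.yaml is only an example:
kubectl apply -f admctl-netpol.yaml
kubectl -n uipath describe networkpolicy allow-all-ingress-to-admctl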

Pods cannot communicate with FQDN in a proxy environment

Description

Pods cannot communicate with the FQDN in a proxy environment, and the following error is displayed:

System.Net.Http.HttpRequestException: The proxy tunnel request to proxy 'http://<proxyFQDN>:8080/' failed with status code '404'.

Solution

To fix the issue, you must create a ServiceEntry, as shown in the following example:
apiVersion: networking.istio.io/v1beta1
kind: ServiceEntry
metadata:
  name: proxy
  namespace: uipath
spec:
  hosts:
  - <proxy-host>
  addresses:
  - <proxy-ip>/32
  ports:
  - number: <proxy-port>
    name: tcp
    protocol: TCP
  location: MESH_EXTERNAL
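You can apply the ServiceEntry with kubectl and confirm that it was registered. A minimal sketch; the file name proxy-serviceentry.yaml is only an example:
kubectl apply -f proxy-serviceentry.yaml
kubectl -n uipath get serviceentry proxy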

Provisioning Automation Suite Robots fails

Description

During the Automation Suite on AKS installation, creating the asrobots-pvc-package-cache PVC for Automation Suite Robots fails.

The failure occurs mainly on FIPS-enabled nodes when using Azure Files with the NFS protocol.

Potential issue

This happens because the AKS cluster cannot connect to Azure Files.

For example, the following error message may be displayed:

failed to provision volume with StorageClass "azurefile-csi-nfs": rpc error: code = Internal desc = update service endpoints failed with error: failed to get the subnet ci-asaks4421698 under vnet ci-asaks4421698: &{false 403 0001-01-01 00:00:00 +0000 UTC {"error":{"code":"AuthorizationFailed","message":"The client '4c200854-2a79-4893-9432-3111795beea0' with object id '4c200854-2a79-4893-9432-3111795beea0' does not have authorization to perform action 'Microsoft.Network/virtualNetworks/subnets/read' over scope '/subscriptions/64fdac10-935b-40e6-bf28-f7dc093f7f76/resourceGroups/ci-asaks4421698/providers/Microsoft.Network/virtualNetworks/ci-asaks4421698/subnets/ci-asaks4421698' or the scope is invalid. If access was recently granted, please refresh your credentials."}}}

Solution

To fix this issue, you need to grant the AKS cluster's managed identity the Network Contributor role on the virtual network:

  1. In Azure, navigate to the AKS resource group, then open the desired virtual network page. For example, in this case, the virtual network is ci-asaks4421698.
  2. From the Subnets list, select the desired subnet. For example, in this case, the subnet is ci-asaks4421698.
  3. At the top of the subnets list, click Manage Users. The Access Control page opens.
  4. Click Add role assignment.
  5. Search for the Network Contributor role.
  6. Select Managed Identity.
  7. Switch to the Members tab.
  8. Select Managed Identity, then select Kubernetes Service.
  9. Select the name of the AKS cluster.
  10. Click Review and Assign.
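If you prefer scripting the assignment, the Azure CLI offers an equivalent. A minimal sketch; all resource names are placeholders, and which managed identity needs the role depends on your cluster setup:
# Look up the object ID of the AKS cluster's managed identity
az aks show -g <aks-resource-group> -n <aks-cluster-name> --query identity.principalId -o tsv
# Grant Network Contributor on the virtual network
az role assignment create \
  --assignee <principal-id-from-above> \
  --role "Network Contributor" \
  --scope "/subscriptions/<subscription-id>/resourceGroups/<resource-group>/providers/Microsoft.Network/virtualNetworks/<vnet-name>"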

AI Center provisioning failure after upgrading to 2023.10

Description

When upgrading from 2023.4.3 to 2023.10, you run into issues with provisioning AI Center.

The system shows the following exception, and the tenant creation fails: "exception":"sun.security.pkcs11.wrapper.PKCS11Exception: CKR_KEY_SIZE_RANGE".

Solution

To resolve this issue, you need to perform a rollout restart of the ai-trainer deployment. To do this, run the following command:
kubectl -n uipath rollout restart deploy ai-trainer-deployment
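You can watch the rollout finish before retrying tenant provisioning. A minimal sketch, using the same deployment name:
kubectl -n uipath rollout status deploy ai-trainer-deployment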

Unable to launch Automation Hub and Apps with proxy setup

Description

If you use a proxy setup, you may run into issues when trying to launch Automation Hub and Apps.

Solution

You can fix the issue by taking the following steps:

  1. Capture the existing coredns configmap from the running cluster:
    kubectl get configmap -n kube-system coredns -o yaml > coredns-config.yaml
  2. Edit the coredns-config.yaml file to append the FQDN rewrite rule to the config.
    1. Rename the configmap to coredns-custom.
    2. Add the following code block to your coredns-config.yaml file. Make sure the code block comes before the kubernetes cluster.local in-addr.arpa ip6.arpa line.
      rewrite stop {
          name exact <cluster-fqdn> istio-ingressgateway.istio-system.svc.cluster.local
      }
    3. Replace <cluster-fqdn> with the actual value.
    Once you have completed these steps, your file should resemble the following sample:
    apiVersion: v1
    data:
      Corefile: |
        .:53 {
            errors
            log
            health
            rewrite stop {
                name exact mycluster.autosuite.com istio-ingressgateway.istio-system.svc.cluster.local
            }
            kubernetes cluster.local in-addr.arpa ip6.arpa {
              pods insecure
              fallthrough in-addr.arpa ip6.arpa
            }
            prometheus :9153
            forward . /etc/resolv.conf
            cache 30
            loop
            reload
            loadbalance
        }
    kind: ConfigMap
    metadata:
      name: coredns-custom
      namespace: kube-system
  3. Create the coredns-custom configmap:
    kubectl apply -f coredns-config.yaml
  4. Replace the volume reference from coredns to coredns-custom in the coredns deployment in the kube-system namespace:
    volumes:
      - emptyDir: {}
        name: tmp
      - configMap:
          defaultMode: 420
          items:
          - key: Corefile
            path: Corefile
          name: coredns-custom
        name: config-volume
  5. Restart the coredns deployment and ensure the coredns pods are up and running without any issues:
    kubectl rollout restart deployment -n kube-system coredns
  6. You should now be able to launch Automation Hub and Apps.
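To confirm that CoreDNS now resolves the cluster FQDN to the Istio ingress gateway, you can run a one-off lookup pod. A minimal sketch; replace <cluster-fqdn> with your actual value:
kubectl run dns-test --rm -it --restart=Never --image=busybox:1.36 -- nslookup <cluster-fqdn>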

Installation fails when Velero is enabled

Description

The Automation Suite installation might fail when Velero is enabled.

Solution

To fix the issue, take the following steps:

  1. Make sure Helm 3.14 is installed on the jumpbox or laptop used for installing Automation Suite.

  2. Extract the configuration values of the failed Helm chart, which in this case is Velero:

    helm -n velero get values velero > customvals.yaml
  3. Add the missing image pull secret in the customvals.yaml file, under the .image.imagePullSecrets path:
    image:
      imagePullSecrets:
      - uipathpullsecret
  4. If Velero has already been installed, uninstall it:

    helm uninstall -n velero velero
  5. Create a new file called velerosecrets.txt. Populate it with your specific information, as shown in the following example:
    AZURE_CLIENT_SECRET=<secretforserviceprincipal>
    AZURE_CLIENT_ID=<clientidforserviceprincipal>
    AZURE_TENANT_ID=<tenantidforserviceprincipal> 
    AZURE_SUBSCRIPTION_ID=<subscriptionidforserviceprincipal>
    AZURE_CLOUD_NAME=AzurePublicCloud
    AZURE_RESOURCE_GROUP=<infraresourcegroupoftheakscluster>
  6. Encode the velerosecrets.txt file:
    export b64velerodata=$(cat velerosecrets.txt | base64)
  7. Create the velero-azure secret in the velero namespace. Include the following content:
    apiVersion: v1
    kind: Secret
    metadata:
      name: velero-azure
      namespace: velero
    data:
      cloud: <put the $b64velerodata value here>
  8. Reinstall Velero:

    helm install velero -n velero <path to velero - 3.1.6 helm chart tgz> -f customvals.yaml
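After the reinstall, you can confirm that the release deployed and that the pods pulled their images using the added secret. A minimal sketch, using the release and namespace names from the command above:
helm -n velero status velero
kubectl -n velero get pods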
