Automation Suite

2023.10

Automation Suite on EKS/AKS Installation Guide

Last updated Apr 19, 2024

Troubleshooting

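Health check of Automation Suite Robot fails

Description

After installing Automation Suite on AKS, when you check the health status of the Automation Suite Robot pod, it returns an unhealthy status: "[POD_UNHEALTHY] Pod asrobots-migrations-cvzfn in namespace uipath is in Failed status".

Potential issue

In rare cases, the database migrations for Orchestrator and Automation Suite Robot can run at the same time. When this happens, the Automation Suite Robot database migration fails. In Argo CD, you can see two migration pods: one in a healthy state and one in an unhealthy state.

Solution

The Automation Suite Robot database migration is retried automatically and succeeds. However, Argo CD does not update the status. You can ignore the unhealthy status.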
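
To see the pods that the health check reports on, you can list them directly. The following is a minimal sketch, assuming kubectl access to the cluster; the function name is hypothetical, and the asrobots name filter is inferred from the pod named in the health check message.

```shell
# Hypothetical helper: list the Automation Suite Robot pods in the
# uipath namespace together with their status.
list_asrobots_pods() {
  # The "asrobots" filter is an assumption based on the pod name in the
  # health check message; "|| true" keeps the exit code clean when no
  # pod matches.
  kubectl -n uipath get pods --no-headers | grep 'asrobots' || true
}

# Usage: list_asrobots_pods
```

If a later migration pod shows a completed status, the retry succeeded and the failed pod is only a leftover of the first attempt.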

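Pods in the uipath namespace stuck when enabling custom node taints

Description

When custom node taints are enabled, pods in the uipath namespace do not run. The pods cannot communicate with the admctl webhook, which injects tolerations into pods in an EKS environment.

Solution

To fix the issue, create a network policy that allows traffic to the admctl webhook from the cluster CIDR or from 0.0.0.0/0.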

kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: allow-all-ingress-to-admctl
  namespace: uipath
spec:
  podSelector:
    matchLabels:
      app: admctl-webhook
  ingress:
    - from:
        - ipBlock:
            cidr: <cluster-pod-cidr> # or "0.0.0.0/0"
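
Once the policy is saved to a file, it can be applied and checked with kubectl. This is a sketch only: the function name and the file name in the usage line are hypothetical, and the policy name matches the manifest above.

```shell
# Hypothetical helper: apply the network policy manifest, then confirm
# that the policy exists in the uipath namespace.
apply_admctl_network_policy() {
  policy_file="$1"
  kubectl apply -f "$policy_file" &&
    kubectl -n uipath get networkpolicy allow-all-ingress-to-admctl
}

# Usage (file name is a placeholder):
# apply_admctl_network_policy admctl-network-policy.yaml
```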

Pods cannot communicate with FQDN in a proxy environment

Description

Pods cannot communicate with the FQDN in a proxy environment, and the following error is displayed:

System.Net.Http.HttpRequestException: The proxy tunnel request to proxy 'http://<proxyFQDN>:8080/' failed with status code '404'.

Solution

To fix the issue, you must create a ServiceEntry, as shown in the following example:

apiVersion: networking.istio.io/v1beta1
kind: ServiceEntry
metadata:
  name: proxy
  namespace: uipath
spec:
  hosts:
  - <proxy-host>
  addresses:
  - <proxy-ip>/32
  ports:
  - number: <proxy-port>
    name: tcp
    protocol: TCP
  location: MESH_EXTERNAL
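
The placeholders in the ServiceEntry can be filled in from the command line. The sketch below assumes a POSIX shell; the function name and the example values in the usage line are hypothetical, and the manifest body is the one shown above.

```shell
# Hypothetical helper: print the ServiceEntry manifest with the proxy
# host, IP address, and port substituted for the placeholders.
render_proxy_service_entry() {
  proxy_host="$1"; proxy_ip="$2"; proxy_port="$3"
  cat <<EOF
apiVersion: networking.istio.io/v1beta1
kind: ServiceEntry
metadata:
  name: proxy
  namespace: uipath
spec:
  hosts:
  - ${proxy_host}
  addresses:
  - ${proxy_ip}/32
  ports:
  - number: ${proxy_port}
    name: tcp
    protocol: TCP
  location: MESH_EXTERNAL
EOF
}

# Usage (values are placeholders):
# render_proxy_service_entry proxy.example.com 10.0.0.10 8080 | kubectl apply -f -
```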

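Provisioning Automation Suite Robot fails

Description

The failure occurs mainly on FIPS-enabled nodes when using Azure Files with the NFS protocol.

During the installation of Automation Suite on AKS, creating the PVC for Automation Suite Robot, asrobots-pvc-package-cache, fails.

Potential issue

This happens because the AKS cluster cannot connect to Azure Files.

For example, the following error message may be displayed: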

failed to provision volume with StorageClass "azurefile-csi-nfs": rpc error: code = Internal desc = update service endpoints failed with error: failed to get the subnet ci-asaks4421698 under vnet ci-asaks4421698: &{false 403 0001-01-01 00:00:00 +0000 UTC {"error":{"code":"AuthorizationFailed","message":"The client '4c200854-2a79-4893-9432-3111795beea0' with object id '4c200854-2a79-4893-9432-3111795beea0' does not have authorization to perform action 'Microsoft.Network/virtualNetworks/subnets/read' over scope '/subscriptions/64fdac10-935b-40e6-bf28-f7dc093f7f76/resourceGroups/ci-asaks4421698/providers/Microsoft.Network/virtualNetworks/ci-asaks4421698/subnets/ci-asaks4421698' or the scope is invalid. If access was recently granted, please refresh your credentials."}}}

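Solution

To fix the issue, you need to grant Automation Suite access to the Azure resource:

1. In Azure, navigate to the AKS resource group, then open the page of the desired virtual network. In this example, the virtual network is ci-asaks4421698.
2. From the Subnets list, select the desired subnet. In this example, the subnet is ci-asaks4421698.
3. At the top of the subnets list, click Manage Users. The Access Control page opens.
4. Click Add role assignment.
5. Search for the Network Contributor role.
6. Select Managed Identity.
7. Switch to the Members tab.
8. Select Managed Identity, and then select Kubernetes Service.
9. Select the name of the AKS cluster.
10. Click Review and Assign.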
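
The same role assignment can be sketched with the Azure CLI. This is not part of the documented procedure: the function name is hypothetical, and the sketch assumes that the cluster uses a system-assigned managed identity and that the virtual network lives in the resource group passed as the first argument.

```shell
# Hypothetical helper: grant the AKS cluster identity the Network
# Contributor role on a subnet.
# Arguments: resource group, AKS cluster name, virtual network, subnet.
grant_subnet_access() {
  rg="$1"; cluster="$2"; vnet="$3"; subnet="$4"

  # Assumption: system-assigned managed identity on the cluster.
  principal_id="$(az aks show -g "$rg" -n "$cluster" \
    --query identity.principalId -o tsv)"

  # Resource ID of the subnet, used as the scope of the assignment.
  subnet_id="$(az network vnet subnet show -g "$rg" \
    --vnet-name "$vnet" -n "$subnet" --query id -o tsv)"

  az role assignment create \
    --assignee-object-id "$principal_id" \
    --assignee-principal-type ServicePrincipal \
    --role "Network Contributor" \
    --scope "$subnet_id"
}

# Usage, with the names from the example above (cluster name is a placeholder):
# grant_subnet_access ci-asaks4421698 <aks-cluster-name> ci-asaks4421698 ci-asaks4421698
```

If the cluster uses a user-assigned identity instead, the principal ID has to be looked up differently; the portal steps remain the reference.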

AI Center provisioning failure after upgrading to 2023.10

Description

When upgrading from 2023.4.3 to 2023.10, you run into issues with provisioning AI Center.

The system shows the following exception, and tenant creation fails: "exception":"sun.security.pkcs11.wrapper.PKCS11Exception: CKR_KEY_SIZE_RANGE

Solution

To fix the issue, you need to restart the ai-trainer deployment. To do so, run the following command:

kubectl -n uipath rollout restart deploy ai-trainer-deployment
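
To wait until the restarted pods are ready before retrying tenant creation, the restart can be combined with a rollout status check. A minimal sketch; the function name and the 300-second timeout are assumptions.

```shell
# Hypothetical helper: restart the ai-trainer deployment and block
# until the rollout completes or the timeout expires.
restart_ai_trainer() {
  kubectl -n uipath rollout restart deploy ai-trainer-deployment &&
    kubectl -n uipath rollout status deploy ai-trainer-deployment --timeout=300s
}

# Usage: restart_ai_trainer
```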

Unable to launch Automation Hub and Apps with proxy setup

Description

If you use a proxy setup, you might run into issues when trying to launch Automation Hub and Apps.

Solution

You can fix the issue by taking the following steps:

Capture the existing coredns config map from the running cluster:

kubectl get configmap -n kube-system coredns -o yaml > coredns-config.yaml

Edit the coredns-config.yaml file to append the fqdn rewrite to the configuration.

Rename the config map to coredns-custom.

Add the following code block to the coredns-config.yaml file. Make sure the code block comes before the kubernetes cluster.local in-addr.arpa ip6.arpa line.

rewrite stop {
            name exact <cluster-fqdn> istio-ingressgateway.istio-system.svc.cluster.local
        }

Replace <cluster-fqdn> with the actual value.

Once you have completed these steps, your file should look like the following example:

apiVersion: v1
data:
  Corefile: |
    .:53 {
        errors
        log
        health
        rewrite stop {
            name exact mycluster.autosuite.com istio-ingressgateway.istio-system.svc.cluster.local
        }
        kubernetes cluster.local in-addr.arpa ip6.arpa {
          pods insecure
          fallthrough in-addr.arpa ip6.arpa
        }
        prometheus :9153
        forward . /etc/resolv.conf
        cache 30
        loop
        reload
        loadbalance
    }
kind: ConfigMap
metadata:
  name: coredns-custom
  namespace: kube-system

Create the coredns-custom config map:

kubectl apply -f coredns-config.yaml

Replace the volume reference from coredns to coredns-custom in the coredns deployment in the kube-system namespace:

volumes:
  - emptyDir: {}
    name: tmp
  - configMap:
      defaultMode: 420
      items:
      - key: Corefile
        path: Corefile
      name: coredns-custom
    name: config-volume

Restart the coredns deployment, and make sure the coredns pods are up and running:

kubectl rollout restart deployment -n kube-system coredns

You should now be able to launch Automation Hub and Apps.
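
The manual Corefile edit from the steps above can also be scripted. The following is a sketch under stated assumptions: the function name is hypothetical, and it relies on awk and on the Corefile containing a line that starts with kubernetes cluster.local. It reads the file on standard input and prints it with the rewrite rule inserted before that line, at the same indentation.

```shell
# Hypothetical helper: insert the FQDN rewrite rule before the
# "kubernetes cluster.local" line of a Corefile read from stdin.
# Argument: the cluster FQDN.
add_fqdn_rewrite() {
  awk -v fqdn="$1" '
    /^[ \t]*kubernetes cluster\.local/ && !inserted {
      indent = $0
      sub(/kubernetes.*/, "", indent)   # keep only the leading whitespace
      print indent "rewrite stop {"
      print indent "    name exact " fqdn " istio-ingressgateway.istio-system.svc.cluster.local"
      print indent "}"
      inserted = 1
    }
    { print }
  '
}

# Usage (output file name is a placeholder):
# add_fqdn_rewrite mycluster.autosuite.com < coredns-config.yaml > coredns-custom.yaml
```

The config map still has to be renamed to coredns-custom by hand, and the result is worth comparing against the example file before applying it.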
